bry0n969 / dsfsD

adsf

Scrapes www.kopavogur.is and www.rbht.nhs.uk

Kópavogsbær | Kópavogur.is


Contributors bry0n969

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> PHP app detected
-----> Bootstrapping...
-----> Installing platform packages...
       NOTICE: No runtime required in composer.lock; using PHP ^5.5.17
       - php (5.6.30)
       - ext-gd (bundled with php)
       - ext-mbstring (bundled with php)
       - ext-pdo_sqlite (bundled with php)
       - ext-sqlite3 (bundled with php)
       - apache (2.4.20)
       - nginx (1.8.1)
-----> Installing dependencies...
       Composer version 1.1.3 2016-06-26 15:42:08
       Loading composer repositories with package information
       Installing dependencies from lock file
       - Installing openaustralia/scraperwiki (dev-morph_defaults e996fe0)
         Cloning e996fe0253bb50330690f5d2bafb66f094dbacb8
       Generating optimized autoload files
-----> Preparing runtime environment...
-----> Checking for additional extensions to install...
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
import scraperwiki

# Blank Python
import scraperwiki
import lxml.html
urls = ["http://www.ebay.com/sch/m.html?_nkw=&_armrs=1&_from=&_ssn=offroadbelts&_pgn=2&_skc=200&rt=nc"]
max_pages = 10000
for wurl in urls:
    curr_url = wurl
    page_idx = 1
    while page_idx <= max_pages :
        error = True
        while error:
            try:
                html = scraperwiki.scrape(curr_url)
                root = lxml.html.fromstring(html)
                for tr in root.cssselect("div[class='ittl'] a"):
                    url = tr.get("href")
                    html = scraperwiki.scrape(url)
                    if html.find("channeladvisor_poweredby-en.gif") != -1 :
                        root2 = lxml.html.fromstring(html)
                        for mname in root2.cssselect("span[class='mbg-nw']"):
                            data = {
                                'url': url,
                                'merchant_name': mname.text
                            }
                            scraperwiki.sqlite.save(unique_keys=['url'],data=data)
                for next_page in root.cssselect("td[class='botpg-next'] a"):
                    print curr_url
                    curr_url = next_page.get("href")
                    page_idx = page_idx +1
                error = False
            except:
                print 'error'
                error = True
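
The script echoed in the console output above retries a failed fetch without any limit (its `while error:` loop only exits on success). As a rough sketch only, and not the scraper's actual code, the same page-walking idea can be written in Python 3 with a bounded retry. The names `fetch_with_retry`, `walk_pages`, `fetch`, `next_url` and `max_retries` are hypothetical and introduced purely for illustration; `fetch` stands in for a call such as `scraperwiki.scrape`, and `next_url` for the "next page" link lookup. Stopping when no next link is found is an assumption of this sketch.

```python
from typing import Callable, Iterator, Optional, Tuple


def fetch_with_retry(fetch: Callable[[str], str], url: str,
                     max_retries: int = 3) -> Optional[str]:
    """Call fetch(url) up to max_retries times; return None if every attempt fails."""
    for _ in range(max_retries):
        try:
            return fetch(url)
        except Exception:
            # Same signal the original script prints, but the loop is bounded.
            print("error")
    return None


def walk_pages(start_url: str,
               fetch: Callable[[str], str],
               next_url: Callable[[str], Optional[str]],
               max_pages: int = 10000,
               max_retries: int = 3) -> Iterator[Tuple[str, str]]:
    """Yield (url, html) per result page, following next_url(html) until it returns None."""
    url: Optional[str] = start_url
    for _ in range(max_pages):
        html = fetch_with_retry(fetch, url, max_retries)
        if html is None:
            return  # give up on a page that keeps failing
        yield url, html
        url = next_url(html)
        if url is None:
            return  # no "next page" link, so stop instead of re-fetching
```

A caller would iterate over `walk_pages(...)` and handle each page's HTML inside the loop, for example by extracting item links and saving rows.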

Statistics

Average successful run time: half a minute

Total run time: half a minute

Total cpu time used: less than 5 seconds

Total disk space used: 23 KB

History

  • Manually ran revision 9c01b5a8 and completed successfully.
    nothing changed in the database
    4 pages scraped
  • Created on morph.io

Scraper code

dsfsD