This is a scraper that runs on morph.io. To get started, see the documentation.

Contributors: woodbine, blablupcom

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 !     The latest version of Python 2 is python-2.7.14 (you are using python-2.7.9, which is unsupported).
 !     We recommend upgrading by specifying the latest version (python-2.7.14).
       Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.9
-----> Installing pip
-----> Installing requirements with pip
       Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r /tmp/build/requirements.txt (line 1))
         Cloning http://github.com/openaustralia/scraperwiki-python.git (to morph_defaults) to /app/.heroku/src/scraperwiki
       Collecting lxml==3.4.4 (from -r /tmp/build/requirements.txt (line 2))
         Downloading lxml-3.4.4.tar.gz (3.5MB)
       Collecting cssselect==0.9.1 (from -r /tmp/build/requirements.txt (line 3))
         Downloading cssselect-0.9.1.tar.gz
       Collecting beautifulsoup4 (from -r /tmp/build/requirements.txt (line 4))
         Downloading beautifulsoup4-4.6.0-py2-none-any.whl (86kB)
       Collecting dumptruck>=0.1.2 (from scraperwiki->-r /tmp/build/requirements.txt (line 1))
         Downloading dumptruck-0.1.6.tar.gz
       Collecting requests (from scraperwiki->-r /tmp/build/requirements.txt (line 1))
         Downloading requests-2.18.4-py2.py3-none-any.whl (88kB)
       Collecting idna<2.7,>=2.5 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
         Downloading idna-2.6-py2.py3-none-any.whl (56kB)
       Collecting urllib3<1.23,>=1.21.1 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
         Downloading urllib3-1.22-py2.py3-none-any.whl (132kB)
       Collecting certifi>=2017.4.17 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
         Downloading certifi-2018.1.18-py2.py3-none-any.whl (151kB)
       Collecting chardet<3.1.0,>=3.0.2 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
         Downloading chardet-3.0.4-py2.py3-none-any.whl (133kB)
       Installing collected packages: dumptruck, idna, urllib3, certifi, chardet, requests, scraperwiki, lxml, cssselect, beautifulsoup4
         Running setup.py install for dumptruck: started
         Running setup.py install for dumptruck: finished with status 'done'
         Running setup.py develop for scraperwiki
         Running setup.py install for lxml: started
         Running setup.py install for lxml: still running...
         Running setup.py install for lxml: finished with status 'done'
         Running setup.py install for cssselect: started
         Running setup.py install for cssselect: finished with status 'done'
       Successfully installed beautifulsoup4-4.6.0 certifi-2018.1.18 chardet-3.0.4 cssselect-0.9.1 dumptruck-0.1.6 idna-2.6 lxml-3.4.4 requests-2.18.4 scraperwiki urllib3-1.22
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
/app/.heroku/python/lib/python2.7/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 24 of the file scraper.py. To get rid of this warning, change code that looks like this:

 BeautifulSoup(YOUR_MARKUP})

to this:

 BeautifulSoup(YOUR_MARKUP, "lxml")

  markup_type=markup_type))
/app/.heroku/python/lib/python2.7/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 36 of the file scraper.py. To get rid of this warning, change code that looks like this:

 BeautifulSoup(YOUR_MARKUP})

to this:

 BeautifulSoup(YOUR_MARKUP, "lxml")

  markup_type=markup_type))
Traceback (most recent call last):
  File "scraper.py", line 38, in <module>
    sublink = block.find('a', href=True)
AttributeError: 'NoneType' object has no attribute 'find'
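
Note: the buildpack warning at the top of the log already names the fix for the interpreter version. Assuming this scraper follows the standard Heroku/morph.io convention of a runtime.txt file at the repository root (the file itself isn't shown in this listing), pinning the recommended version is a single line:

    python-2.7.14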
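Note: the UserWarning raised at lines 24 and 36 of scraper.py is cosmetic but worth fixing, because the parser bs4 picks can differ between machines and change behaviour. A minimal sketch of the change the warning asks for, with a placeholder URL since the page being scraped isn't identified in the log:

    import requests
    from bs4 import BeautifulSoup

    # Placeholder URL; the real target lives in scraper.py.
    html = requests.get("http://example.com").text

    # Naming the parser explicitly silences the warning and pins behaviour.
    soup = BeautifulSoup(html, "lxml")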
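Note: the actual failure is the AttributeError in the traceback. On line 38, block.find('a', href=True) runs after an earlier find() returned None, which usually means the expected element wasn't on the page (layout change, empty page, or an error page). A hedged sketch of a defensive rewrite; the div/class selector is hypothetical, since the real selectors in scraper.py aren't visible here:

    from bs4 import BeautifulSoup

    html = "<div><p>no matching block on this page</p></div>"  # stand-in markup
    soup = BeautifulSoup(html, "lxml")

    block = soup.find("div", class_="documents")  # hypothetical selector
    if block is None:
        # Logging and skipping beats crashing the whole run with an AttributeError.
        print("expected block not found; page layout may have changed")
    else:
        sublink = block.find("a", href=True)
        if sublink is not None:
            print(sublink["href"])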

Data

Downloaded 523 times by SimKennedy, woodbine, MikeRalphson and blablupcom

Download table (as CSV) · Download SQLite database (26 KB) · Use the API

Showing 10 of 93 rows
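
Note: the API mentioned above is morph.io's data endpoint, https://api.morph.io/<owner>/<scraper>/data.<format>, which runs a SQL query against the scraper's SQLite database. A minimal sketch of fetching these rows; OWNER and API_KEY are placeholders (the scraper's owner isn't shown here), and the table name "data" is the default written by the morph_defaults branch of scraperwiki-python installed above:

    import requests

    # OWNER and API_KEY are placeholders; sign in to morph.io for a real key.
    url = "https://api.morph.io/OWNER/sp_E3720_WCC_gov/data.json"
    params = {"key": "API_KEY", "query": "select * from data limit 10"}
    rows = requests.get(url, params=params).json()
    print("%d rows fetched" % len(rows))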

Statistics

Average successful run time: 6 minutes

Total run time: about 1 month

Total CPU time used: about 1 hour

Total disk space used: 58.8 KB

History

  • Auto ran revision e868e857 and failed.
    nothing changed in the database
  • Auto ran revision e868e857 and failed.
    nothing changed in the database
    4 pages scraped
  • Auto ran revision e868e857 and failed.
    nothing changed in the database
    4 pages scraped
  • Auto ran revision e868e857 and completed successfully.
    93 records added, 93 records removed in the database
  • Auto ran revision e868e857 and completed successfully.
    93 records added, 93 records removed in the database
  • ...
  • Created on morph.io

Scraper code

Python

sp_E3720_WCC_gov / scraper.py