woodbine / sp_E1801_HC_gov

Scrapes www.herefordshire.gov.uk

The Herefordshire Council Homepage


Contributors blablupcom woodbine

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
-----> Installing python-2.7.14
-----> Installing pip
-----> Installing requirements with pip
Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r /tmp/build/requirements.txt (line 1))
Cloning http://github.com/openaustralia/scraperwiki-python.git (to revision morph_defaults) to /app/.heroku/src/scraperwiki
Collecting lxml==3.4.4 (from -r /tmp/build/requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/63/c7/4f2a2a4ad6c6fa99b14be6b3c1cece9142e2d915aa7c43c908677afc8fa4/lxml-3.4.4.tar.gz (3.5MB)
Collecting cssselect==0.9.1 (from -r /tmp/build/requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/aa/e5/9ee1460d485b94a6d55732eb7ad5b6c084caf73dd6f9cb0bb7d2a78fafe8/cssselect-0.9.1.tar.gz
Collecting beautifulsoup4 (from -r /tmp/build/requirements.txt (line 4))
Downloading https://files.pythonhosted.org/packages/a6/29/bcbd41a916ad3faf517780a0af7d0254e8d6722ff6414723eedba4334531/beautifulsoup4-4.6.0-py2-none-any.whl (86kB)
Collecting python-dateutil (from -r /tmp/build/requirements.txt (line 5))
Downloading https://files.pythonhosted.org/packages/cf/f5/af2b09c957ace60dcfac112b669c45c8c97e32f94aa8b56da4c6d1682825/python_dateutil-2.7.3-py2.py3-none-any.whl (211kB)
Collecting dumptruck>=0.1.2 (from scraperwiki->-r /tmp/build/requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/15/27/3330a343de80d6849545b6c7723f8c9a08b4b104de964ac366e7e6b318df/dumptruck-0.1.6.tar.gz
Collecting requests (from scraperwiki->-r /tmp/build/requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/65/47/7e02164a2a3db50ed6d8a6ab1d6d60b69c4c3fdf57a284257925dfc12bda/requests-2.19.1-py2.py3-none-any.whl (91kB)
Collecting six>=1.5 (from python-dateutil->-r /tmp/build/requirements.txt (line 5))
Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Collecting idna<2.8,>=2.5 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl (58kB)
Collecting certifi>=2017.4.17 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl (150kB)
Collecting urllib3<1.24,>=1.21.1 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/bd/c9/6fdd990019071a4a32a5e7cb78a1d92c53851ef4f56f62a3486e6a7d8ffb/urllib3-1.23-py2.py3-none-any.whl (133kB)
Collecting chardet<3.1.0,>=3.0.2 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
Installing collected packages: dumptruck, idna, certifi, urllib3, chardet, requests, scraperwiki, lxml, cssselect, beautifulsoup4, six, python-dateutil
Running setup.py install for dumptruck: started
Running setup.py install for dumptruck: finished with status 'done'
Running setup.py develop for scraperwiki
Running setup.py install for lxml: started
Running setup.py install for lxml: still running...
Running setup.py install for lxml: finished with status 'done'
Running setup.py install for cssselect: started
Running setup.py install for cssselect: finished with status 'done'
Successfully installed beautifulsoup4-4.6.0 certifi-2018.4.16 chardet-3.0.4 cssselect-0.9.1 dumptruck-0.1.6 idna-2.7 lxml-3.4.4 python-dateutil-2.7.3 requests-2.19.1 scraperwiki six-1.11.0 urllib3-1.23
-----> Discovering process types
Procfile declares types -> scraper
Injecting scraper and running...
E1801_HC_gov_2015_12
E1801_HC_gov_2015_11
E1801_HC_gov_2015_10
E1801_HC_gov_2015_09
E1801_HC_gov_2015_08
E1801_HC_gov_2015_07
E1801_HC_gov_2015_06
E1801_HC_gov_2015_05
E1801_HC_gov_2015_04
E1801_HC_gov_2015_03
E1801_HC_gov_2015_02
E1801_HC_gov_2015_01
E1801_HC_gov_2013_04
E1801_HC_gov_2013_05
E1801_HC_gov_2013_06
E1801_HC_gov_2013_07
E1801_HC_gov_2013_08
E1801_HC_gov_2013_09
E1801_HC_gov_2013_10
E1801_HC_gov_2013_11
E1801_HC_gov_2013_12
E1801_HC_gov_2014_12
E1801_HC_gov_2014_11
E1801_HC_gov_2014_10
E1801_HC_gov_2014_09
E1801_HC_gov_2014_08
E1801_HC_gov_2014_07
E1801_HC_gov_2014_06
E1801_HC_gov_2014_05
E1801_HC_gov_2014_04
E1801_HC_gov_2014_03
E1801_HC_gov_2014_02
E1801_HC_gov_2014_01
E1801_HC_gov_2016_12
E1801_HC_gov_2016_11
E1801_HC_gov_2016_10
E1801_HC_gov_2016_09
E1801_HC_gov_2016_07
E1801_HC_gov_2016_06
E1801_HC_gov_2016_05
E1801_HC_gov_2016_04
E1801_HC_gov_2016_03
E1801_HC_gov_2016_02
E1801_HC_gov_2016_01
E1801_HC_gov_2016_08
E1801_HC_gov_2017_01
E1801_HC_gov_2017_02
E1801_HC_gov_2017_03
E1801_HC_gov_2017_Q2
E1801_HC_gov_2017_Q3
E1801_HC_gov_2017_Q4
E1801_HC_gov_2018_01
E1801_HC_gov_2018_02
E1801_HC_gov_2018_03
E1801_HC_gov_2018_04
E1801_HC_gov_2018_04 *Error: Invalid filetype* https://www.herefordshire.gov.uk/download/downloads/id/14216/expenditure_for_april_2018_pdf.pdf
E1801_HC_gov_2018_05
E1801_HC_gov_2018_05 *Error: Invalid filetype* https://www.herefordshire.gov.uk/download/downloads/id/14337/expenditure_for_may_2018_pdf.pdf
Traceback (most recent call last):
  File "scraper.py", line 142, in <module>
    raise Exception("%d errors occurred during scrape." % errors)
Exception: 2 errors occurred during scrape.
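
The console output above ends with two "Invalid filetype" messages followed by an exception that reports the error count. The uncaught exception is why the run exits with status code 1. The scraper's own check is not shown on this page, so the following is only a minimal sketch of how such a check might work. It assumes the check is based on the URL's file extension and that spreadsheet formats are the accepted ones. The names `ALLOWED_EXTENSIONS`, `validate_filetype`, `check_sources` and `finish` are hypothetical. Only the two message strings are taken from the log.

```python
import os

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, which the log shows the scraper runs on

# Assumption: the scraper accepts spreadsheet downloads only. The real list
# of accepted types is not visible in the console output.
ALLOWED_EXTENSIONS = {".csv", ".xls", ".xlsx"}


def validate_filetype(url):
    """Return True if the URL path ends in an accepted extension (hypothetical check)."""
    path = urlparse(url).path
    extension = os.path.splitext(path)[1].lower()
    return extension in ALLOWED_EXTENSIONS


def check_sources(sources):
    """Print each run ID, report invalid links, and return (valid_sources, error_count)."""
    errors = 0
    valid = []
    for run_id, url in sources:
        print(run_id)
        if validate_filetype(url):
            valid.append((run_id, url))
        else:
            errors += 1
            # Message format copied from the console output.
            print("%s *Error: Invalid filetype* %s" % (run_id, url))
    return valid, errors


def finish(errors):
    """Raise once all links are checked, so the process exits with a non-zero status."""
    if errors > 0:
        raise Exception("%d errors occurred during scrape." % errors)
```

Calling `check_sources` with the April and May 2018 links from the log, then passing the resulting count to `finish`, would raise `Exception: 2 errors occurred during scrape.` under these assumptions. Raising only after every link has been checked is consistent with the history entries, where each failed run still adds 56 records. A real scraper might inspect the response's Content-Type header instead of the extension.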

Data

Downloaded 519 times by SimKennedy woodbine MikeRalphson

rows 10 / 56

Statistics

Average successful run time: 2 minutes

Total run time: 15 days

Total cpu time used: about 1 hour

Total disk space used: 52.3 KB

History

  • Auto ran revision 859d3c30 and failed.
    56 records added, 56 records removed in the database
    117 pages scraped
  • Auto ran revision 859d3c30 and failed.
    56 records added, 56 records removed in the database
    117 pages scraped
  • Auto ran revision 859d3c30 and failed.
    56 records added, 56 records removed in the database
    117 pages scraped
  • Auto ran revision 859d3c30 and failed.
    56 records added, 56 records removed in the database
  • Auto ran revision 859d3c30 and failed.
    56 records added, 56 records removed in the database
  • ...
  • Created on morph.io

Scraper code

Python

sp_E1801_HC_gov / scraper.py