blablupcom / HCA085_HACA_gov

Scrapes www.gov.uk

GOV.UK - The place to find government services and information - Simpler, clearer, faster


Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
-----> Stack changed, re-installing runtime
-----> Installing runtime (python-2.7.6)
-----> Installing dependencies with pip
       Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r requirements.txt (line 1))
       Cloning http://github.com/openaustralia/scraperwiki-python.git (to morph_defaults) to ./.heroku/src/scraperwiki
       Collecting lxml==3.4.4 (from -r requirements.txt (line 2))
       Downloading lxml-3.4.4.tar.gz (3.5MB)
       Building lxml version 3.4.4.
       Building without Cython.
       Using build configuration of libxslt 1.1.28
       /app/.heroku/python/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
       warnings.warn(msg)
       Collecting cssselect==0.9.1 (from -r requirements.txt (line 3))
       Downloading cssselect-0.9.1.tar.gz
       Collecting beautifulsoup4 (from -r requirements.txt (line 4))
       Downloading beautifulsoup4-4.4.0-py2-none-any.whl (81kB)
       Collecting dumptruck>=0.1.2 (from scraperwiki->-r requirements.txt (line 1))
       Downloading dumptruck-0.1.6.tar.gz
       Collecting requests (from scraperwiki->-r requirements.txt (line 1))
       Downloading requests-2.7.0-py2.py3-none-any.whl (470kB)
       Installing collected packages: requests, dumptruck, beautifulsoup4, cssselect, lxml, scraperwiki
       Running setup.py install for dumptruck
       Running setup.py install for cssselect
       Running setup.py install for lxml
       Building lxml version 3.4.4.
       Building without Cython.
       Using build configuration of libxslt 1.1.28
       /app/.heroku/python/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
       warnings.warn(msg)
       building 'lxml.etree' extension
       gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/tmp/pip-build-8345yA/lxml/src/lxml/includes -I/app/.heroku/python/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w
       gcc -pthread -shared build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -L/app/.heroku/python/lib -lxslt -lexslt -lxml2 -lz -lm -lpython2.7 -o build/lib.linux-x86_64-2.7/lxml/etree.so
       building 'lxml.objectify' extension
       gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -I/tmp/pip-build-8345yA/lxml/src/lxml/includes -I/app/.heroku/python/include/python2.7 -c src/lxml/lxml.objectify.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.objectify.o -w
       gcc -pthread -shared build/temp.linux-x86_64-2.7/src/lxml/lxml.objectify.o -L/app/.heroku/python/lib -lxslt -lexslt -lxml2 -lz -lm -lpython2.7 -o build/lib.linux-x86_64-2.7/lxml/objectify.so
       Running setup.py develop for scraperwiki
       Creating /app/.heroku/python/lib/python2.7/site-packages/scraperwiki.egg-link (link to .)
       Adding scraperwiki 0.3.7 to easy-install.pth file
       Installed /app/.heroku/src/scraperwiki
       Successfully installed beautifulsoup4-4.4.0 cssselect-0.9.1 dumptruck-0.1.6 lxml-3.4.4 requests-2.7.0 scraperwiki
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2015_03
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2015_02
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2015_01
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_12
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_11
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_10
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_09
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_08
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_07
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_06
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_05
/app/.heroku/python/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
HCA085_HACA_gov_2014_04

Data

Downloaded 1 time by MikeRalphson

Download table (as CSV) Download SQLite database (7 KB) Use the API

rows 10 / 12

d l f
2015-09-07 21:57:59.238015    HCA085_HACA_gov_2015_03
2015-09-07 21:58:01.826707    HCA085_HACA_gov_2015_02
2015-09-07 21:58:04.272447    HCA085_HACA_gov_2015_01
2015-09-07 21:58:05.014029    HCA085_HACA_gov_2014_12
2015-09-07 21:58:07.960852    HCA085_HACA_gov_2014_11
2015-09-07 21:58:10.702288    HCA085_HACA_gov_2014_10
2015-09-07 21:58:11.215990    HCA085_HACA_gov_2014_09
2015-09-07 21:58:13.636476    HCA085_HACA_gov_2014_08
2015-09-07 21:58:16.214616    HCA085_HACA_gov_2014_07
2015-09-07 21:58:19.078007    HCA085_HACA_gov_2014_06
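
The rows above could be read back from the downloaded SQLite database roughly as follows. This is only a sketch under assumptions: the table name `data` (morph.io's usual default) and the mapping of the timestamp to column d and the label to column f are guesses, column l is left empty because its values are not shown on this page, and the helper names `build_sample_db` and `labels_newest_first` are hypothetical. An in-memory database stands in for the real file.

```python
import sqlite3

# Assumption: table "data" with columns d, l, f, as in the header above.
# Which value belongs to which column is a guess: timestamp in d, label in f.
# Column l is not visible on the page, so it is left as NULL here.
SAMPLE_ROWS = [
    ("2015-09-07 21:57:59.238015", None, "HCA085_HACA_gov_2015_03"),
    ("2015-09-07 21:58:01.826707", None, "HCA085_HACA_gov_2015_02"),
]


def build_sample_db():
    """Create an in-memory stand-in for the downloaded SQLite file."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (d TEXT, l TEXT, f TEXT)")
    conn.executemany("INSERT INTO data (d, l, f) VALUES (?, ?, ?)", SAMPLE_ROWS)
    conn.commit()
    return conn


def labels_newest_first(conn):
    """Return the f values ordered by the d timestamp, most recent first."""
    cur = conn.execute("SELECT f FROM data ORDER BY d DESC")
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_sample_db()
    for label in labels_newest_first(conn):
        print(label)
```

To run the same query against the real download, `sqlite3.connect(":memory:")` would be replaced by a connection to the downloaded file, whose name is not given here.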

Statistics

Average successful run time: 3 minutes

Total run time: 3 minutes

Total cpu time used: less than 5 seconds

Total disk space used: 29 KB

History

  • Manually ran revision c11719f8 and completed successfully.
    12 records added in the database
    13 pages scraped
  • Created on morph.io

Scraper code

Python

HCA085_HACA_gov / scraper.py