woodbine / sp_DOH033_NIFHA_gov

Scrapes www.nice.org.uk

Guidance, advice and information services for health, public health and social care professionals.


Contributors: blablupcom, woodbine

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 ! The latest version of Python 2 is python-2.7.14 (you are using python-2.7.6, which is unsupported).
 ! We recommend upgrading by specifying the latest version (python-2.7.14).
   Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.6
-----> Installing pip
-----> Installing requirements with pip
  Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r /tmp/build/requirements.txt (line 1))
  Cloning http://github.com/openaustralia/scraperwiki-python.git (to revision morph_defaults) to /app/.heroku/src/scraperwiki
  Collecting lxml==3.4.4 (from -r /tmp/build/requirements.txt (line 2))
  /app/.heroku/python/lib/python2.7/site-packages/pip/_vendor/urllib3/util/ssl_.py:339: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
  /app/.heroku/python/lib/python2.7/site-packages/pip/_vendor/urllib3/util/ssl_.py:137: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  /app/.heroku/python/lib/python2.7/site-packages/pip/_vendor/urllib3/util/ssl_.py:137: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  Downloading https://files.pythonhosted.org/packages/63/c7/4f2a2a4ad6c6fa99b14be6b3c1cece9142e2d915aa7c43c908677afc8fa4/lxml-3.4.4.tar.gz (3.5MB)
  Collecting cssselect==0.9.1 (from -r /tmp/build/requirements.txt (line 3))
  Downloading https://files.pythonhosted.org/packages/aa/e5/9ee1460d485b94a6d55732eb7ad5b6c084caf73dd6f9cb0bb7d2a78fafe8/cssselect-0.9.1.tar.gz
  Collecting beautifulsoup4 (from -r /tmp/build/requirements.txt (line 4))
  Downloading https://files.pythonhosted.org/packages/a6/29/bcbd41a916ad3faf517780a0af7d0254e8d6722ff6414723eedba4334531/beautifulsoup4-4.6.0-py2-none-any.whl (86kB)
  Collecting python-dateutil (from -r /tmp/build/requirements.txt (line 5))
  Downloading https://files.pythonhosted.org/packages/0c/57/19f3a65bcf6d5be570ee8c35a5398496e10a0ddcbc95393b2d17f86aaaf8/python_dateutil-2.7.2-py2.py3-none-any.whl (212kB)
  Collecting requests[security] (from -r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/49/df/50aa1999ab9bde74656c2919d9c0c085fd2b3775fd3eca826012bef76d8c/requests-2.18.4-py2.py3-none-any.whl (88kB)
  Collecting dumptruck>=0.1.2 (from scraperwiki->-r /tmp/build/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/15/27/3330a343de80d6849545b6c7723f8c9a08b4b104de964ac366e7e6b318df/dumptruck-0.1.6.tar.gz
  Collecting six>=1.5 (from python-dateutil->-r /tmp/build/requirements.txt (line 5))
  Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
  Collecting chardet<3.1.0,>=3.0.2 (from requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
  Collecting certifi>=2017.4.17 (from requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl (150kB)
  Collecting urllib3<1.23,>=1.21.1 (from requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/63/cb/6965947c13a94236f6d4b8223e21beb4d576dc72e8130bd7880f600839b8/urllib3-1.22-py2.py3-none-any.whl (132kB)
  Collecting idna<2.7,>=2.5 (from requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/27/cc/6dd9a3869f15c2edfab863b992838277279ce92663d334df9ecf5106f5c6/idna-2.6-py2.py3-none-any.whl (56kB)
  Collecting cryptography>=1.3.4; extra == "security" (from requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/b8/d2/34f54bf9459446965d0a4939ac872d6f82495cf16f48efc224af5de7f985/cryptography-2.2.2-cp27-cp27m-manylinux1_x86_64.whl (2.2MB)
  Collecting pyOpenSSL>=0.14; extra == "security" (from requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/79/db/7c0cfe4aa8341a5fab4638952520d8db6ab85ff84505e12c00ea311c3516/pyOpenSSL-17.5.0-py2.py3-none-any.whl (53kB)
  Collecting cffi>=1.7; platform_python_implementation != "PyPy" (from cryptography>=1.3.4; extra == "security"->requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/5d/a7/348bf05f004e7534012dc533ee29650d88fb25bf013988518e0acf6961fa/cffi-1.11.5-cp27-cp27m-manylinux1_x86_64.whl (407kB)
  Collecting enum34; python_version < "3" (from cryptography>=1.3.4; extra == "security"->requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/c5/db/e56e6b4bbac7c4a06de1c50de6fe1ef3810018ae11732a50f15f62c7d050/enum34-1.1.6-py2-none-any.whl
  Collecting ipaddress; python_version < "3" (from cryptography>=1.3.4; extra == "security"->requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/fc/d0/7fc3a811e011d4b388be48a0e381db8d990042df54aa4ef4599a31d39853/ipaddress-1.0.22-py2.py3-none-any.whl
  Collecting asn1crypto>=0.21.0 (from cryptography>=1.3.4; extra == "security"->requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/ea/cd/35485615f45f30a510576f1a56d1e0a7ad7bd8ab5ed7cdc600ef7cd06222/asn1crypto-0.24.0-py2.py3-none-any.whl (101kB)
  Collecting pycparser (from cffi>=1.7; platform_python_implementation != "PyPy"->cryptography>=1.3.4; extra == "security"->requests[security]->-r /tmp/build/requirements.txt (line 6))
  Downloading https://files.pythonhosted.org/packages/8c/2d/aad7f16146f4197a11f8e91fb81df177adcc2073d36a17b1491fd09df6ed/pycparser-2.18.tar.gz (245kB)
  Installing collected packages: dumptruck, chardet, certifi, urllib3, idna, pycparser, cffi, enum34, six, ipaddress, asn1crypto, cryptography, pyOpenSSL, requests, scraperwiki, lxml, cssselect, beautifulsoup4, python-dateutil
  Running setup.py install for dumptruck: started
  Running setup.py install for dumptruck: finished with status 'done'
  Running setup.py install for pycparser: started
  Running setup.py install for pycparser: finished with status 'done'
  Running setup.py develop for scraperwiki
  Running setup.py install for lxml: started
  Running setup.py install for lxml: still running...
  Running setup.py install for lxml: finished with status 'done'
  Running setup.py install for cssselect: started
  Running setup.py install for cssselect: finished with status 'done'
  Successfully installed asn1crypto-0.24.0 beautifulsoup4-4.6.0 certifi-2018.4.16 cffi-1.11.5 chardet-3.0.4 cryptography-2.2.2 cssselect-0.9.1 dumptruck-0.1.6 enum34-1.1.6 idna-2.6 ipaddress-1.0.22 lxml-3.4.4 pyOpenSSL-17.5.0 pycparser-2.18 python-dateutil-2.7.2 requests-2.18.4 scraperwiki six-1.11.0 urllib3-1.22
 ! Hello! It looks like your application is using an outdated version of Python.
 ! This caused the security warning you saw above during the 'pip install' step.
 ! We recommend 'python-3.6.2', which you can specify in a 'runtime.txt' file.
 ! -- Much Love, Heroku.
-----> Discovering process types
  Procfile declares types -> scraper
Injecting scraper and running...
DOH033_NIFHA_gov_2018_03 DOH033_NIFHA_gov_2018_02 DOH033_NIFHA_gov_2018_01
DOH033_NIFHA_gov_2017_12 DOH033_NIFHA_gov_2017_11 DOH033_NIFHA_gov_2017_10 DOH033_NIFHA_gov_2017_09 DOH033_NIFHA_gov_2017_08 DOH033_NIFHA_gov_2017_07 DOH033_NIFHA_gov_2017_06 DOH033_NIFHA_gov_2017_05 DOH033_NIFHA_gov_2017_04 DOH033_NIFHA_gov_2017_03 DOH033_NIFHA_gov_2017_02 DOH033_NIFHA_gov_2017_01
DOH033_NIFHA_gov_2016_12 DOH033_NIFHA_gov_2016_11 DOH033_NIFHA_gov_2016_10 DOH033_NIFHA_gov_2016_09 DOH033_NIFHA_gov_2016_08 DOH033_NIFHA_gov_2016_07 DOH033_NIFHA_gov_2016_06 DOH033_NIFHA_gov_2016_05 DOH033_NIFHA_gov_2016_04 DOH033_NIFHA_gov_2016_03 DOH033_NIFHA_gov_2016_02 DOH033_NIFHA_gov_2016_01
DOH033_NIFHA_gov_2015_12 DOH033_NIFHA_gov_2015_11 DOH033_NIFHA_gov_2015_10 DOH033_NIFHA_gov_2015_09 DOH033_NIFHA_gov_2015_08 DOH033_NIFHA_gov_2015_07 DOH033_NIFHA_gov_2015_06 DOH033_NIFHA_gov_2015_05 DOH033_NIFHA_gov_2015_04 DOH033_NIFHA_gov_2015_03 DOH033_NIFHA_gov_2015_02 DOH033_NIFHA_gov_2015_01
DOH033_NIFHA_gov_2014_12 DOH033_NIFHA_gov_2014_11 DOH033_NIFHA_gov_2014_10 DOH033_NIFHA_gov_2014_09 DOH033_NIFHA_gov_2014_08 DOH033_NIFHA_gov_2014_07 DOH033_NIFHA_gov_2014_06 DOH033_NIFHA_gov_2014_05 DOH033_NIFHA_gov_2014_04 DOH033_NIFHA_gov_2014_03 DOH033_NIFHA_gov_2014_02 DOH033_NIFHA_gov_2014_01
DOH033_NIFHA_gov_2013_12 DOH033_NIFHA_gov_2013_11 DOH033_NIFHA_gov_2013_10 DOH033_NIFHA_gov_2013_09 DOH033_NIFHA_gov_2013_08 DOH033_NIFHA_gov_2013_07 DOH033_NIFHA_gov_2013_06 DOH033_NIFHA_gov_2013_05 DOH033_NIFHA_gov_2013_04 DOH033_NIFHA_gov_2013_03 DOH033_NIFHA_gov_2013_02 DOH033_NIFHA_gov_2013_01
DOH033_NIFHA_gov_2012_12 DOH033_NIFHA_gov_2012_11 DOH033_NIFHA_gov_2012_10 DOH033_NIFHA_gov_2012_09 DOH033_NIFHA_gov_2012_08 DOH033_NIFHA_gov_2012_07 DOH033_NIFHA_gov_2012_06 DOH033_NIFHA_gov_2012_05 DOH033_NIFHA_gov_2012_04

Data

Downloaded 513 times by SimKennedy, woodbine, MikeRalphson


rows 10 / 74

d | l | f
2015-09-20 03:00:03.559731 | https://www.nice.org.uk/Media/Default/About/Who-we-are/Corporate-publications/Publication-scheme/Transparency-of-spend/Fiscal Year 2015-2016/Transparency-of-Spend-over-25k-April-2015.csv | DOH033_NIFHA_gov_2015_04
2015-11-08 23:32:18.885911 | https://www.nice.org.uk/Media/Default/About/Who-we-are/Corporate-publications/Publication-scheme/Transparency-of-spend/Fiscal Year 2015-2016/transparency-of-spend-over-25k-october-2015.xls | DOH033_NIFHA_gov_2015_10
2018-04-26 21:19:31.952202 | | DOH033_NIFHA_gov_2018_03
2018-04-26 21:19:33.434349 | | DOH033_NIFHA_gov_2018_02
2018-04-26 21:19:34.906190 | | DOH033_NIFHA_gov_2018_01
2018-04-26 21:19:36.254410 | | DOH033_NIFHA_gov_2017_12
2018-04-26 21:19:37.763280 | | DOH033_NIFHA_gov_2017_11
2018-04-26 21:19:39.144421 | | DOH033_NIFHA_gov_2017_10
2018-04-26 21:19:40.454812 | | DOH033_NIFHA_gov_2017_09
2018-04-26 21:19:41.834601 | | DOH033_NIFHA_gov_2017_08
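
The f values in the rows above appear to follow an <identifier>_<year>_<month> pattern. A minimal sketch of splitting one into its parts, assuming that pattern holds for every row (it is inferred from the sample only; parse_record_name is a hypothetical helper, not part of the scraper):

```python
import re
from datetime import date

# Assumed layout: <entity id>_<4-digit year>_<2-digit month>.
# This is inferred from the sample rows, not taken from the scraper code.
NAME_RE = re.compile(r"^(?P<entity>.+)_(?P<year>\d{4})_(?P<month>\d{2})$")


def parse_record_name(name):
    """Split a value such as 'DOH033_NIFHA_gov_2015_04' into
    (entity id, first day of the reporting month)."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError("unexpected record name: %r" % name)
    year = int(match.group("year"))
    month = int(match.group("month"))
    if not 1 <= month <= 12:
        raise ValueError("month out of range in %r" % name)
    return match.group("entity"), date(year, month, 1)


if __name__ == "__main__":
    entity, period = parse_record_name("DOH033_NIFHA_gov_2015_04")
    print(entity, period.isoformat())
```

Returning a date rather than two integers makes it straightforward to sort rows chronologically or to compare the reporting month against the d timestamp.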

Statistics

Average successful run time: 3 minutes

Total run time: about 1 month

Total cpu time used: about 1 hour

Total disk space used: 70.1 KB

History

  • Auto ran revision b0111866 and completed successfully.
    72 records added, 72 records removed in the database
    76 pages scraped
  • Auto ran revision b0111866 and completed successfully.
    72 records added, 72 records removed in the database
    76 pages scraped
  • Auto ran revision b0111866 and completed successfully.
    72 records added, 72 records removed in the database
    76 pages scraped
  • Auto ran revision b0111866 and completed successfully.
    72 records added, 72 records removed in the database
  • Auto ran revision b0111866 and completed successfully.
    72 records added, 72 records removed in the database
    76 pages scraped
  • ...
  • Created on morph.io
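
Each run in the history reports as many records removed as added. One plausible explanation, not confirmed by this page, is that the scraper re-saves every row under a unique key, so each save replaces the existing row instead of adding a new one. A sketch of that behaviour with the standard sqlite3 module, assuming a table named data with columns d, l and f, and f as the key (the table name, the key choice and the helper names are all assumptions):

```python
import sqlite3


def open_store():
    """Create an in-memory table shaped like the rows shown on this page.
    Using f as the primary key is an assumption made for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (d TEXT, l TEXT, f TEXT PRIMARY KEY)")
    return conn


def save_record(conn, d, l, f):
    """Insert a row, replacing any existing row that has the same f."""
    conn.execute(
        "INSERT OR REPLACE INTO data (d, l, f) VALUES (?, ?, ?)",
        (d, l, f),
    )
    conn.commit()


def count_records(conn):
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]


if __name__ == "__main__":
    conn = open_store()
    # Two runs saving the same record: the second replaces the first.
    save_record(conn, "2018-04-25 21:19:31", "", "DOH033_NIFHA_gov_2018_03")
    save_record(conn, "2018-04-26 21:19:31", "", "DOH033_NIFHA_gov_2018_03")
    print(count_records(conn))
```

Under this reading the row count stays constant between runs while only the d timestamp changes, which would match the identical added and removed counts.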


Scraper code

Python

sp_DOH033_NIFHA_gov / scraper.py