mobeets / wikipedia-signpost

New posts on Wikipedia's The Signpost


Checks for new articles in Wikipedia's The Signpost.

This is a scraper that runs on morph.io.

Contributors: mobeets

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 !     The latest version of Python 2 is python-2.7.14 (you are using python-2.7.9, which is unsupported).
 !     We recommend upgrading by specifying the latest version (python-2.7.14).
       Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.9
-----> Installing pip
-----> Installing requirements with pip
       Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r /tmp/build/requirements.txt (line 6))
       Cloning http://github.com/openaustralia/scraperwiki-python.git (to revision morph_defaults) to /app/.heroku/src/scraperwiki
       Collecting lxml==3.4.4 (from -r /tmp/build/requirements.txt (line 8))
       Downloading https://files.pythonhosted.org/packages/63/c7/4f2a2a4ad6c6fa99b14be6b3c1cece9142e2d915aa7c43c908677afc8fa4/lxml-3.4.4.tar.gz (3.5MB)
       Collecting cssselect==0.9.1 (from -r /tmp/build/requirements.txt (line 9))
       Downloading https://files.pythonhosted.org/packages/aa/e5/9ee1460d485b94a6d55732eb7ad5b6c084caf73dd6f9cb0bb7d2a78fafe8/cssselect-0.9.1.tar.gz
       Collecting BeautifulSoup==3.2.1 (from -r /tmp/build/requirements.txt (line 10))
       Downloading https://files.pythonhosted.org/packages/1e/ee/295988deca1a5a7accd783d0dfe14524867e31abb05b6c0eeceee49c759d/BeautifulSoup-3.2.1.tar.gz
       Collecting dumptruck>=0.1.2 (from scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/15/27/3330a343de80d6849545b6c7723f8c9a08b4b104de964ac366e7e6b318df/dumptruck-0.1.6.tar.gz
       Collecting requests (from scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/49/df/50aa1999ab9bde74656c2919d9c0c085fd2b3775fd3eca826012bef76d8c/requests-2.18.4-py2.py3-none-any.whl (88kB)
       Collecting idna<2.7,>=2.5 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/27/cc/6dd9a3869f15c2edfab863b992838277279ce92663d334df9ecf5106f5c6/idna-2.6-py2.py3-none-any.whl (56kB)
       Collecting urllib3<1.23,>=1.21.1 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/63/cb/6965947c13a94236f6d4b8223e21beb4d576dc72e8130bd7880f600839b8/urllib3-1.22-py2.py3-none-any.whl (132kB)
       Collecting certifi>=2017.4.17 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl (150kB)
       Collecting chardet<3.1.0,>=3.0.2 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
       Installing collected packages: dumptruck, idna, urllib3, certifi, chardet, requests, scraperwiki, lxml, cssselect, BeautifulSoup
       Running setup.py install for dumptruck: started
       Running setup.py install for dumptruck: finished with status 'done'
       Running setup.py develop for scraperwiki
       Running setup.py install for lxml: started
       Running setup.py install for lxml: still running...
       Running setup.py install for lxml: finished with status 'done'
       Running setup.py install for cssselect: started
       Running setup.py install for cssselect: finished with status 'done'
       Running setup.py install for BeautifulSoup: started
       Running setup.py install for BeautifulSoup: finished with status 'done'
       Successfully installed BeautifulSoup-3.2.1 certifi-2018.4.16 chardet-3.0.4 cssselect-0.9.1 dumptruck-0.1.6 idna-2.6 lxml-3.4.4 requests-2.18.4 scraperwiki urllib3-1.22
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
Traceback (most recent call last):
  File "scraper.py", line 19, in <module>
    vs = [{'name': x.span.text, 'url': BASEURL + x['href']} for x in base.findAll('a') if x['href'].count('/wiki/Wikipedia:Wikipedia_Signpost/20') and x.span]
  File "/app/.heroku/python/lib/python2.7/site-packages/BeautifulSoup.py", line 613, in __getitem__
    return self._getAttrMap()[key]
KeyError: 'href'
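The buildpack warning at the top of the log is actionable: Heroku-style Python builds read the interpreter version from a runtime.txt file at the repository root, and morph.io uses the same buildpacks. A one-line runtime.txt would pin the recommended version (assuming morph.io's stack still offers it):

    python-2.7.14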
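The crash itself comes from line 19 of scraper.py, shown in the traceback: in BeautifulSoup 3, subscripting a tag (x['href']) raises KeyError when the attribute is absent, so the first <a> without an href (a named anchor, for example) aborts the whole run. A minimal sketch of a defensive rewrite, assuming BASEURL and base are defined earlier in scraper.py as the traceback suggests:

    # Filter on href=True so findAll only returns anchors that actually
    # carry an href attribute; x['href'] is then safe to subscript.
    vs = [{'name': x.span.text, 'url': BASEURL + x['href']}
          for x in base.findAll('a', href=True)
          if x['href'].count('/wiki/Wikipedia:Wikipedia_Signpost/20') and x.span]

An equivalent fix is x.get('href', ''), which returns a default instead of raising; either way, runs that hit href-less anchors would finish instead of exiting with status 1.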

Data

Downloaded 4 times by MikeRalphson

Statistics

Average successful run time: 1 minute

Total run time: about 21 hours

Total cpu time used: 8 minutes

Total disk space used: 81.5 KB

History

  • Auto ran revision 13b8adbf and failed.
    nothing changed in the database
  • Auto ran revision 13b8adbf and failed.
    nothing changed in the database
  • Auto ran revision 13b8adbf and failed.
    nothing changed in the database
  • Auto ran revision 13b8adbf and failed.
    nothing changed in the database
  • Auto ran revision 13b8adbf and failed.
    nothing changed in the database
  • ...
  • Created on morph.io


Scraper code

Python

wikipedia-signpost / scraper.py