soit-sk / slovakia_parliament

Slovak parliament session transcripts database


Scraper created at the OpenScraper Challenge 2014 and improved in 2015.

Dependencies

The scraper has a few dependencies, which are listed in the requirements.txt file.

Scraper

This is a scraper that runs on Morph. To get started, see the documentation.

Contributors: mnagy, katkad, ricco386, lkundrak

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 ! The latest version of Python 2 is python-2.7.14 (you are using python-2.7.12, which is unsupported).
 ! We recommend upgrading by specifying the latest version (python-2.7.14).
   Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.12
-----> Installing pip
-----> Installing requirements with pip
   DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
   Collecting requests==2.8.0 (from -r /tmp/build/requirements.txt (line 1))
     Downloading https://files.pythonhosted.org/packages/5d/a6/90f822c17b4fc905da67aed49b511f110207242ff164aeda926461101dc6/requests-2.8.0-py2.py3-none-any.whl (476kB)
   Collecting scraperwiki==0.5.1 (from -r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/30/84/d874847baad89f03e6984fcd87505a37bf924b66519d1e07bf76e2369af0/scraperwiki-0.5.1.tar.gz
   Collecting beautifulsoup4==4.4.1 (from -r /tmp/build/requirements.txt (line 3))
     Downloading https://files.pythonhosted.org/packages/33/62/f3e97eaa87fc4de0cb9b8c51d253cf0df621c6de6b25164dcbab203e5ff7/beautifulsoup4-4.4.1-py2-none-any.whl (81kB)
   Collecting six (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
   Collecting sqlalchemy (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/55/98/56b7155bab287cd0c78dee26258835db36e91f2efef41f125ed6f6f1f334/SQLAlchemy-1.3.6.tar.gz (5.9MB)
   Collecting alembic (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/7b/8b/0c98c378d93165d9809193f274c3c6e2151120d955b752419c7d43e4d857/alembic-1.0.11.tar.gz (1.0MB)
   Collecting Mako (from alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/b0/3c/8dcd6883d009f7cae0f3157fb53e9afb05a0d3d33b3db1268ec2e6f4a56b/Mako-1.1.0.tar.gz (463kB)
   Collecting python-editor>=0.3 (from alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/55/a0/3c0ba1c10f2ca381645dd46cb7afbb73fddc8de9f957e1f9e726a846eabc/python_editor-1.0.4-py2-none-any.whl
   Collecting python-dateutil (from alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/41/17/c62faccbfbd163c7f57f3844689e3a78bae1f403648a6afb1d0866d87fbb/python_dateutil-2.8.0-py2.py3-none-any.whl (226kB)
   Collecting MarkupSafe>=0.9.2 (from Mako->alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
     Downloading https://files.pythonhosted.org/packages/d8/1f/e97c4c6b182e59562f99c207f0f621d15a42fc82a6532a98e0b2d38b7c4e/MarkupSafe-1.1.1-cp27-cp27m-manylinux1_x86_64.whl
   Building wheels for collected packages: scraperwiki, sqlalchemy, alembic, Mako
     Building wheel for scraperwiki (setup.py): started
     Building wheel for scraperwiki (setup.py): finished with status 'done'
     Created wheel for scraperwiki: filename=scraperwiki-0.5.1-cp27-none-any.whl size=6547 sha256=205a6fa7a457391a5ac469b87ff0954d17261a5e6809cad2348c02859bb3e483
     Stored in directory: /tmp/pip-ephem-wheel-cache-soB5RG/wheels/6a/6e/60/e13b585339206922e816bb90c355b79aa077ab2b15d7cc26a7
     Building wheel for sqlalchemy (setup.py): started
     Building wheel for sqlalchemy (setup.py): finished with status 'done'
     Created wheel for sqlalchemy: filename=SQLAlchemy-1.3.6-cp27-cp27m-linux_x86_64.whl size=1171604 sha256=37d10be12308e4cb243871903c5d04cbf1c0f9edc9d7737167f9b429706fd074
     Stored in directory: /tmp/pip-ephem-wheel-cache-soB5RG/wheels/f2/ec/e0/d7deb0c981557e373edf7370574b7001690892afe5fea30c3c
     Building wheel for alembic (setup.py): started
     Building wheel for alembic (setup.py): finished with status 'done'
     Created wheel for alembic: filename=alembic-1.0.11-py2.py3-none-any.whl size=162179 sha256=482afebd6d3b9372c62cf300952529ba2b2f5b7e5f9cf16c13c2ebfb976b8f1a
     Stored in directory: /tmp/pip-ephem-wheel-cache-soB5RG/wheels/8b/65/b2/9837b4422d13e739c3324c428f1b3aa9e3c3df666bb420e4b3
     Building wheel for Mako (setup.py): started
     Building wheel for Mako (setup.py): finished with status 'done'
     Created wheel for Mako: filename=Mako-1.1.0-cp27-none-any.whl size=75360 sha256=de472e083292968c8633364f74370866628008c61bb121d7bd383c566add9121
     Stored in directory: /tmp/pip-ephem-wheel-cache-soB5RG/wheels/98/32/7b/a291926643fc1d1e02593e0d9e247c5a866a366b8343b7aa27
   Successfully built scraperwiki sqlalchemy alembic Mako
   Installing collected packages: requests, six, sqlalchemy, MarkupSafe, Mako, python-editor, python-dateutil, alembic, scraperwiki, beautifulsoup4
   Successfully installed Mako-1.1.0 MarkupSafe-1.1.1 alembic-1.0.11 beautifulsoup4-4.4.1 python-dateutil-2.8.0 python-editor-1.0.4 requests-2.8.0 scraperwiki-0.5.1 six-1.12.0 sqlalchemy-1.3.6
   DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
-----> Discovering process types
   Procfile declares types -> scraper
Injecting scraper and running...
Got 0 rows for page #2271
[]
No data for page #2271, ending
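The last lines of the console output suggest that the scraper walks numbered pages and stops at the first page that yields no rows. The actual scraper.py is not reproduced here, so the following is only a minimal sketch of that stopping rule. The names `scrape_all_pages` and `fake_fetch` are hypothetical, and `fake_fetch` stands in for the real HTTP request and HTML parsing.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def scrape_all_pages(fetch_page: Callable[[int], List[Row]],
                     start_page: int = 1) -> List[Row]:
    """Collect rows page by page until a page comes back empty."""
    collected: List[Row] = []
    page = start_page
    while True:
        rows = fetch_page(page)
        print("Got %d rows for page #%d" % (len(rows), page))
        if not rows:
            # An empty page ends the run, as in the last log line.
            print("No data for page #%d, ending" % page)
            break
        collected.extend(rows)
        page += 1
    return collected


def fake_fetch(page: int) -> List[Row]:
    """Hypothetical stand-in for fetching and parsing one page."""
    pages = {1: [{"member": "A"}, {"member": "B"}], 2: [{"member": "C"}]}
    return pages.get(page, [])


all_rows = scrape_all_pages(fake_fetch)
```

Starting from a later page, as the run above appears to do with page #2271, would only require passing a different `start_page`.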

Data

Downloaded 8 times by jakubcevela, loren, Protesuiq, MikeRalphson, MiroJanosik

Download table (as CSV) Download SQLite database (30.1 MB) Use the API
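For readers who prefer the API to a file download, the sketch below builds a query URL in the shape morph.io documents for its data API (`https://api.morph.io/[scraper]/data.[format]?key=[api_key]&query=[sql]`). The helper name `build_api_url`, the placeholder key, and the table name `data` are assumptions; check the scraper's API page for the actual table name before relying on it. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode


def build_api_url(scraper: str, sql: str, api_key: str, fmt: str = "json") -> str:
    """Return a morph.io data API URL for the given scraper and SQL query."""
    # urlencode escapes spaces, commas and other special characters in the SQL.
    query_string = urlencode({"key": api_key, "query": sql})
    return "https://api.morph.io/%s/data.%s?%s" % (scraper, fmt, query_string)


url = build_api_url(
    "soit-sk/slovakia_parliament",
    "SELECT member, time_from, time_to FROM data LIMIT 10",  # table name assumed
    "YOUR_API_KEY",  # placeholder, not a real key
)
```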

Showing 10 of 48809 rows.

Columns: meeting_number, time_from, proceedings_video, speech_video, member, term_nr, time_to, transcript. The proceedings_video, speech_video and transcript values do not appear in this preview.

meeting_number  time_from                  member           term_nr  time_to
39              2014-10-29T15:58:47+00:00  Vážny, Ľubomír   6        2014-10-29T16:02:29+00:00
42              2014-11-08T22:32:27+00:00  Laššáková, Jana  6        2014-11-08T22:42:16+00:00
42              2014-11-08T22:42:16+00:00  Laššáková, Jana  6        2014-11-08T22:44:15+00:00
42              2014-11-08T22:44:15+00:00  Laššáková, Jana  6        2014-11-08T22:46:21+00:00
42              2014-11-08T22:47:24+00:00  Laššáková, Jana  6        2014-11-08T22:48:26+00:00
42              2014-11-08T22:48:26+00:00  Laššáková, Jana  6        2014-11-08T22:50:30+00:00
42              2014-11-08T22:50:30+00:00  Laššáková, Jana  6        2014-11-08T22:52:17+00:00
42              2014-11-08T22:52:17+00:00  Laššáková, Jana  6        2014-11-08T22:54:02+00:00
42              2014-11-08T22:54:02+00:00  Laššáková, Jana  6        2014-11-08T22:56:06+00:00
42              2014-11-08T22:56:06+00:00  Laššáková, Jana  6        2014-11-08T22:58:11+00:00
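The time_from and time_to columns are ISO 8601 timestamps, so the length of each speech segment can be derived directly from a row. The sketch below uses the first two preview rows as sample input. The helper name `speech_seconds` is hypothetical, and `datetime.fromisoformat` requires Python 3.7 or later, which is newer than the Python 2.7 runtime the scraper itself runs on.

```python
from datetime import datetime

# First two rows of the preview, reduced to the columns used here.
sample_rows = [
    {"member": "Vážny, Ľubomír",
     "time_from": "2014-10-29T15:58:47+00:00",
     "time_to": "2014-10-29T16:02:29+00:00"},
    {"member": "Laššáková, Jana",
     "time_from": "2014-11-08T22:32:27+00:00",
     "time_to": "2014-11-08T22:42:16+00:00"},
]


def speech_seconds(row):
    """Return the duration of one segment in whole seconds."""
    start = datetime.fromisoformat(row["time_from"])
    end = datetime.fromisoformat(row["time_to"])
    return int((end - start).total_seconds())


durations = [speech_seconds(row) for row in sample_rows]
print(durations)  # [222, 589]
```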

Statistics

Average successful run time: about 2 hours

Total run time: 3 days

Total cpu time used: about 1 hour

Total disk space used: 30.2 MB

History

  • Manually ran revision 252b9645 and completed successfully.
    nothing changed in the database
  • Manually ran revision 252b9645 and completed successfully.
    nothing changed in the database
  • Manually ran revision 252b9645 and completed successfully.
    nothing changed in the database
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    578 pages scraped
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    362 pages scraped
  • ...
  • Created on morph.io
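Several runs above report "nothing changed in the database". One plausible reason is that rows are saved under a unique key, so saving an already stored row again replaces it instead of adding a duplicate. The sketch below illustrates that idea with the standard sqlite3 module. The key `(member, time_from)`, the table name `data`, and the helper `save_rows` are assumptions for illustration, not taken from the scraper's code.

```python
import sqlite3


def save_rows(conn, rows):
    """Upsert rows keyed on (member, time_from); return how many rows were added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data ("
        "member TEXT, time_from TEXT, time_to TEXT, "
        "PRIMARY KEY (member, time_from))"  # assumed unique key
    )
    before = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    conn.executemany(
        "INSERT OR REPLACE INTO data (member, time_from, time_to) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
rows = [("Laššáková, Jana",
         "2014-11-08T22:32:27+00:00",
         "2014-11-08T22:42:16+00:00")]

first_run = save_rows(conn, rows)   # the row is new
second_run = save_rows(conn, rows)  # same key, so nothing is added
```

With this scheme, re-running the scraper over pages it has already stored leaves the row count unchanged.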

Scraper code

Python

slovakia_parliament / scraper.py