soit-sk / slovakia_parliament

Slovak parliament session transcripts database



Scraper created at OpenScraper Challenge 2014, improved in 2015

Dependencies

The scraper has a few dependencies, which are listed in the requirements.txt file.
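Judging from the console output of the last run further down this page, the file pins three packages, while the others that appear in the log (six, sqlalchemy, alembic, Mako, python-editor, MarkupSafe) are pulled in transitively by scraperwiki. The following is a reconstruction from that log under this assumption, not a copy of the actual file:

```text
requests==2.8.0
scraperwiki==0.5.1
beautifulsoup4==4.4.1
```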

Scraper

This is a scraper that runs on Morph. To get started, see the Morph documentation.

Contributors: mnagy, katkad, ricco386, lkundrak

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
-----> Installing python-2.7.12
$ pip install -r requirements.txt
Collecting requests==2.8.0 (from -r /tmp/build/requirements.txt (line 1))
Downloading requests-2.8.0-py2.py3-none-any.whl (476kB)
Collecting scraperwiki==0.5.1 (from -r /tmp/build/requirements.txt (line 2))
Downloading scraperwiki-0.5.1.tar.gz
Collecting beautifulsoup4==4.4.1 (from -r /tmp/build/requirements.txt (line 3))
Downloading beautifulsoup4-4.4.1-py2-none-any.whl (81kB)
Collecting six (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
Downloading six-1.10.0-py2.py3-none-any.whl
Collecting sqlalchemy (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
Downloading SQLAlchemy-1.1.8.tar.gz (5.2MB)
Collecting alembic (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
Downloading alembic-0.9.1.tar.gz (999kB)
Collecting Mako (from alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
Downloading Mako-1.0.6.tar.gz (575kB)
Collecting python-editor>=0.3 (from alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
Downloading python-editor-1.0.3.tar.gz
Collecting MarkupSafe>=0.9.2 (from Mako->alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 2))
Downloading MarkupSafe-1.0.tar.gz
Installing collected packages: requests, six, sqlalchemy, MarkupSafe, Mako, python-editor, alembic, scraperwiki, beautifulsoup4
Running setup.py install for sqlalchemy: started
Running setup.py install for sqlalchemy: finished with status 'done'
Running setup.py install for MarkupSafe: started
Running setup.py install for MarkupSafe: finished with status 'done'
Running setup.py install for Mako: started
Running setup.py install for Mako: finished with status 'done'
Running setup.py install for python-editor: started
Running setup.py install for python-editor: finished with status 'done'
Running setup.py install for alembic: started
Running setup.py install for alembic: finished with status 'done'
Running setup.py install for scraperwiki: started
Running setup.py install for scraperwiki: finished with status 'done'
Successfully installed Mako-1.0.6 MarkupSafe-1.0 alembic-0.9.1 beautifulsoup4-4.4.1 python-editor-1.0.3 requests-2.8.0 scraperwiki-0.5.1 six-1.10.0 sqlalchemy-1.1.8
-----> Discovering process types
Procfile declares types -> scraper
Injecting scraper and running...
Got 0 rows for page #227 []
No data for page #227, ending
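The last lines of the log suggest that the scraper walks numbered result pages and stops at the first page that yields no rows. The sketch below illustrates that pattern only; it is not the code in scraper.py. The names `scrape_all_pages` and `fetch_rows` are hypothetical, and the page fetcher is passed in as a callable so the example runs without network access (Python 3 is assumed here, although the log shows the scraper itself running on Python 2.7).

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, str]


def scrape_all_pages(
    fetch_rows: Callable[[int], List[Row]], start_page: int = 1
) -> Iterator[Row]:
    """Yield rows page by page until a page returns no rows.

    `fetch_rows` is a hypothetical callable that takes a page number
    and returns the parsed rows of that page (an empty list when the
    page has no data).
    """
    page = start_page
    while True:
        rows = fetch_rows(page)
        print("Got %d rows for page #%d" % (len(rows), page))
        if not rows:
            # An empty page is treated as the end of the listing.
            print("No data for page #%d, ending" % page)
            return
        for row in rows:
            yield row
        page += 1


if __name__ == "__main__":
    # Fake two-page listing standing in for the real website.
    fake_pages = {
        1: [{"member": "A"}],
        2: [{"member": "B"}, {"member": "C"}],
    }
    collected = list(scrape_all_pages(lambda p: fake_pages.get(p, [])))
    print(len(collected), "rows collected")
```

Stopping at the first empty page keeps the loop simple, but it also means that a temporary gap in the listing ends the run early, so a real scraper may want a more tolerant stop condition.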

Data

Downloaded 8 times by loren, Protesuiq, jakubcevela, MikeRalphson, MiroJanosik


Showing 10 of 48809 rows

meeting_number | time_from | proceedings_video | speech_video | member | term_nr | time_to | transcript
39 | 2014-10-29T15:58:47+00:00 | | | Vážny, Ľubomír | 6 | 2014-10-29T16:02:29+00:00 |
42 | 2014-11-08T22:32:27+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:42:16+00:00 |
42 | 2014-11-08T22:42:16+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:44:15+00:00 |
42 | 2014-11-08T22:44:15+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:46:21+00:00 |
42 | 2014-11-08T22:47:24+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:48:26+00:00 |
42 | 2014-11-08T22:48:26+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:50:30+00:00 |
42 | 2014-11-08T22:50:30+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:52:17+00:00 |
42 | 2014-11-08T22:52:17+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:54:02+00:00 |
42 | 2014-11-08T22:54:02+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:56:06+00:00 |
42 | 2014-11-08T22:56:06+00:00 | | | Laššáková, Jana | 6 | 2014-11-08T22:58:11+00:00 |
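Because each row carries a time_from and a time_to timestamp, speech lengths can be derived directly from the downloaded SQLite database. The following is a minimal sketch using only the Python standard library. It assumes Python 3.7 or later (for `datetime.fromisoformat`) and a table named `data`; the table name and the helper `speech_durations` are assumptions, not something this page confirms. The demo builds an in-memory database from the first sample row instead of opening the real file.

```python
import sqlite3
from datetime import datetime


def speech_durations(conn, table="data"):
    """Return a list of (member, seconds) for rows with both timestamps.

    `table` is an assumed table name; adjust it to whatever the
    downloaded database actually contains.
    """
    # Table names cannot be bound as SQL parameters, so the name is
    # interpolated; only pass trusted values here.
    query = 'SELECT member, time_from, time_to FROM "{}"'.format(table)
    durations = []
    for member, time_from, time_to in conn.execute(query):
        if not time_from or not time_to:
            continue  # skip rows with a missing timestamp
        start = datetime.fromisoformat(time_from)
        end = datetime.fromisoformat(time_to)
        durations.append((member, (end - start).total_seconds()))
    return durations


if __name__ == "__main__":
    # In-memory stand-in for the real database, using one sample row.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (member TEXT, time_from TEXT, time_to TEXT)"
    )
    conn.execute(
        "INSERT INTO data VALUES (?, ?, ?)",
        (
            "Vážny, Ľubomír",
            "2014-10-29T15:58:47+00:00",
            "2014-10-29T16:02:29+00:00",
        ),
    )
    for member, seconds in speech_durations(conn):
        print(member, seconds)
```

To run this against the real data, replace the in-memory connection with `sqlite3.connect(...)` pointing at the downloaded database file.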

Statistics

Average successful run time: about 2 hours

Total run time: 3 days

Total cpu time used: about 1 hour

Total disk space used: 30.2 MB

History

  • Manually ran revision 252b9645 and completed successfully.
    nothing changed in the database
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    578 pages scraped
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    362 pages scraped
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    3033 pages scraped
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    3095 pages scraped
  • Manually ran revision 4c92df28 and failed.
    nothing changed in the database
    7 pages scraped
  • Manually ran revision 4734532d and failed.
    nothing changed in the database
    2 pages scraped
  • Manually ran revision 4734532d and failed.
    42780 records added, 42780 records removed in the database
    2140 pages scraped
  • Manually ran revision ea6445cd and failed.
    nothing changed in the database
  • Manually ran revision 756eb81b and failed.
    48809 records added, 5720 records removed in the database
  • Manually ran revision abb4702a and failed.
    5720 records added in the database
  • Manually ran revision abb4702a and failed.
    44707 records added in the database
  • Manually ran revision abb4702a and failed.
    42391 records added in the database
  • Manually ran revision abb4702a and failed.
  • Manually ran revision abb4702a and failed.
  • Created on morph.io

Scraper code

Python

slovakia_parliament / scraper.py