soit-sk / slovakia_parliament

Slovak parliament session transcripts database

Scrapes www.nrsr.sk

Website of the National Council of the Slovak Republic. The National Council of the Slovak Republic (hereinafter "the National Council") is the sole constitutional and legislative body of the Slovak Republic. It is a body of state power, and the status of the other state bodies is derived from its primary position in the republic. As an elected body, it represents the sovereignty of the state and the people. It plays an important role in building the Slovak Republic as a modern and democratic state and in introducing a socially and ecologically oriented market economy. Members of the National Council are elected by secret ballot in universal, equal and direct elections. There are 150 members, and their term of office is four years.



Scraper created at OpenScraper Challenge 2014, improved in 2015

Dependencies

The scraper has a few dependencies, which are listed in the requirements.txt file.

Scraper

This is a scraper that runs on Morph. To get started see the documentation

Contributors: mnagy, katkad, ricco386, lkundrak

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...  -----> Python app detected -----> Installing python-2.7.6  $ pip install -r requirements.txt  Collecting requests==2.8.0 (from -r requirements.txt (line 1))  /app/.heroku/python/lib/python2.7/site-packages/pip-8.1.2-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.  SNIMissingWarning  /app/.heroku/python/lib/python2.7/site-packages/pip-8.1.2-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.  
InsecurePlatformWarning  Downloading requests-2.8.0-py2.py3-none-any.whl (476kB)  Collecting scraperwiki==0.5.1 (from -r requirements.txt (line 2))  Downloading scraperwiki-0.5.1.tar.gz  Collecting beautifulsoup4==4.4.1 (from -r requirements.txt (line 3))  Downloading beautifulsoup4-4.4.1-py2-none-any.whl (81kB)  Collecting six (from scraperwiki==0.5.1->-r requirements.txt (line 2))  Downloading six-1.10.0-py2.py3-none-any.whl  Collecting sqlalchemy (from scraperwiki==0.5.1->-r requirements.txt (line 2))  Downloading SQLAlchemy-1.1.0.tar.gz (5.1MB)  Collecting alembic (from scraperwiki==0.5.1->-r requirements.txt (line 2))  Downloading alembic-0.8.8.tar.gz (970kB)  Collecting Mako (from alembic->scraperwiki==0.5.1->-r requirements.txt (line 2))  Downloading Mako-1.0.4.tar.gz (574kB)  Collecting python-editor>=0.3 (from alembic->scraperwiki==0.5.1->-r requirements.txt (line 2))  Downloading python-editor-1.0.1.tar.gz  Collecting MarkupSafe>=0.9.2 (from Mako->alembic->scraperwiki==0.5.1->-r requirements.txt (line 2))  Downloading MarkupSafe-0.23.tar.gz  Installing collected packages: requests, six, sqlalchemy, MarkupSafe, Mako, python-editor, alembic, scraperwiki, beautifulsoup4  Running setup.py install for sqlalchemy: started  Running setup.py install for sqlalchemy: finished with status 'done'  Running setup.py install for MarkupSafe: started  Running setup.py install for MarkupSafe: finished with status 'done'  Running setup.py install for Mako: started  Running setup.py install for Mako: finished with status 'done'  Running setup.py install for python-editor: started  Running setup.py install for python-editor: finished with status 'done'  Running setup.py install for alembic: started  Running setup.py install for alembic: finished with status 'done'  Running setup.py install for scraperwiki: started  Running setup.py install for scraperwiki: finished with status 'done'  Successfully installed Mako-1.0.4 MarkupSafe-0.23 alembic-0.8.8 beautifulsoup4-4.4.1 
python-editor-1.0.1 requests-2.8.0 scraperwiki-0.5.1 six-1.10.0 sqlalchemy-1.1.0   ! Hello! It looks like your application is using an outdated version of Python.  ! This caused the security warning you saw above during the 'pip install' step.  ! We recommend 'python-2.7.12', which you can specify in a 'runtime.txt' file.  ! -- Much Love, Heroku.   -----> Discovering process types  Procfile declares types -> scraper Injecting scraper and running... Got 14 rows for page #360 [{'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166279', 'time_to': datetime.datetime(2016, 9, 23, 16, 53, 26), 'time_from': datetime.datetime(2016, 9, 23, 16, 52, 31), 'meeting_number': 7, 'member': u'Koll\xe1r, Boris', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166279'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166240', 'time_to': datetime.datetime(2016, 9, 23, 16, 54, 6), 'time_from': datetime.datetime(2016, 9, 23, 16, 53, 26), 'meeting_number': 7, 'member': u'Zemanov\xe1, Anna', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166240'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166247', 'time_to': datetime.datetime(2016, 9, 23, 17, 14, 23), 'time_from': datetime.datetime(2016, 9, 23, 17, 2, 6), 'meeting_number': 7, 'member': u'Polia\u010dik, Martin', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166247'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166250', 'time_to': datetime.datetime(2016, 9, 23, 17, 21, 10), 'time_from': datetime.datetime(2016, 9, 23, 17, 14, 54), 'meeting_number': 7, 'member': u'Cig\xe1nikov\xe1, Jana', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166250'}, {'speech_video': 
u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166252', 'time_to': datetime.datetime(2016, 9, 23, 17, 42, 58), 'time_from': datetime.datetime(2016, 9, 23, 17, 21, 53), 'meeting_number': 7, 'member': u'Rajt\xe1r, Jozef', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166252'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166257', 'time_to': datetime.datetime(2016, 9, 23, 17, 45, 52), 'time_from': datetime.datetime(2016, 9, 23, 17, 43, 13), 'meeting_number': 7, 'member': u'\u0160uca, Peter', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166257'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166262', 'time_to': datetime.datetime(2016, 9, 23, 18, 5, 26), 'time_from': datetime.datetime(2016, 9, 23, 17, 53, 24), 'meeting_number': 7, 'member': u'Bug\xe1r, B\xe9la', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166262'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166278', 'time_to': datetime.datetime(2016, 9, 23, 16, 53, 26), 'time_from': datetime.datetime(2016, 9, 23, 16, 52, 31), 'meeting_number': 7, 'member': u'Hrn\u010diar, Andrej', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166278'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166280', 'time_to': datetime.datetime(2016, 9, 23, 16, 53, 26), 'time_from': datetime.datetime(2016, 9, 23, 16, 52, 31), 'meeting_number': 7, 'member': u'Hrn\u010diar, Andrej', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166280'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166242', 'time_to': datetime.datetime(2016, 9, 23, 16, 54, 6), 'time_from': datetime.datetime(2016, 9, 23, 16, 53, 35), 
'meeting_number': 7, 'member': u'Bug\xe1r, B\xe9la', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166242'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166249', 'time_to': datetime.datetime(2016, 9, 23, 17, 14, 54), 'time_from': datetime.datetime(2016, 9, 23, 17, 8, 32), 'meeting_number': 7, 'member': u'Bug\xe1r, B\xe9la', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166249'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166251', 'time_to': datetime.datetime(2016, 9, 23, 17, 21, 53), 'time_from': datetime.datetime(2016, 9, 23, 17, 14, 54), 'meeting_number': 7, 'member': u'Bug\xe1r, B\xe9la', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166251'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166265', 'time_to': datetime.datetime(2016, 9, 23, 17, 43, 13), 'time_from': datetime.datetime(2016, 9, 23, 17, 38, 27), 'meeting_number': 7, 'member': u'Bug\xe1r, B\xe9la', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166265'}, {'speech_video': u'http://tv.nrsr.sk/archiv/schodza/7/7?id=166266', 'time_to': datetime.datetime(2016, 9, 23, 17, 45, 53), 'time_from': datetime.datetime(2016, 9, 23, 17, 43, 14), 'meeting_number': 7, 'member': u'Bug\xe1r, B\xe9la', 'proceedings_video': u'http://tv.nrsr.sk/archiv/schodza/7/7', 'term_nr': 7, 'transcript': u'http://tv.nrsr.sk/transcript?id=166266'}] Got 0 rows for page #361 [] No data for page #361, ending
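
The end of the log above ("Got 0 rows for page #361", "No data for page #361, ending") suggests that the scraper walks numbered pages until one comes back empty. A minimal sketch of that stopping rule follows. It is not the code from scraper.py: scrape_all, fetch_page and fake_fetch_page are hypothetical names used only for illustration, and the real HTTP request and HTML parsing are replaced by a stub.

```python
def scrape_all(fetch_page, start_page=1):
    """Collect rows page by page until a page returns no rows.

    fetch_page is a hypothetical callable: page number -> list of row dicts.
    """
    all_rows = []
    page = start_page
    while True:
        rows = fetch_page(page)
        print("Got %d rows for page #%d" % (len(rows), page))
        if not rows:
            # An empty page is treated as the end of the listing.
            print("No data for page #%d, ending" % page)
            break
        all_rows.extend(rows)
        page += 1
    return all_rows


def fake_fetch_page(page):
    # Stand-in for the real request to the parliament site (assumption).
    pages = {
        1: [{"member": "A", "term_nr": 7}, {"member": "B", "term_nr": 7}],
        2: [{"member": "C", "term_nr": 7}],
    }
    return pages.get(page, [])


rows = scrape_all(fake_fetch_page)
```

Passing the page fetcher in as an argument keeps the loop testable without network access; a real fetcher would also need error handling so that a failed request is not mistaken for the last page.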

Data

Downloaded 6 times by jakubcevela, loren, MiroJanosik, MikeRalphson



rows 10 / 48809

Columns: meeting_number, time_from, proceedings_video, speech_video, member, term_nr, time_to, transcript
The proceedings_video, speech_video and transcript columns hold links and are not shown here.

meeting_number  time_from                  member           term_nr  time_to
39              2014-10-29T15:58:47+00:00  Vážny, Ľubomír   6        2014-10-29T16:02:29+00:00
42              2014-11-08T22:32:27+00:00  Laššáková, Jana  6        2014-11-08T22:42:16+00:00
42              2014-11-08T22:42:16+00:00  Laššáková, Jana  6        2014-11-08T22:44:15+00:00
42              2014-11-08T22:44:15+00:00  Laššáková, Jana  6        2014-11-08T22:46:21+00:00
42              2014-11-08T22:47:24+00:00  Laššáková, Jana  6        2014-11-08T22:48:26+00:00
42              2014-11-08T22:48:26+00:00  Laššáková, Jana  6        2014-11-08T22:50:30+00:00
42              2014-11-08T22:50:30+00:00  Laššáková, Jana  6        2014-11-08T22:52:17+00:00
42              2014-11-08T22:52:17+00:00  Laššáková, Jana  6        2014-11-08T22:54:02+00:00
42              2014-11-08T22:54:02+00:00  Laššáková, Jana  6        2014-11-08T22:56:06+00:00
42              2014-11-08T22:56:06+00:00  Laššáková, Jana  6        2014-11-08T22:58:11+00:00
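
Each row stores time_from and time_to as ISO 8601 timestamps, so the length of a speech can be derived directly from the data. The sketch below loads two of the sample rows into an in-memory SQLite database and computes their durations. The table name data and the helper speech_seconds are assumptions made for illustration; check the downloaded database for the actual table name.

```python
import sqlite3
from datetime import datetime

# In-memory stand-in for the downloaded SQLite database (assumption:
# the table is called "data"; only five of the eight columns are used).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (meeting_number INTEGER, time_from TEXT, "
    "member TEXT, term_nr INTEGER, time_to TEXT)"
)
sample = [
    (39, "2014-10-29T15:58:47+00:00", "Vážny, Ľubomír", 6,
     "2014-10-29T16:02:29+00:00"),
    (42, "2014-11-08T22:32:27+00:00", "Laššáková, Jana", 6,
     "2014-11-08T22:42:16+00:00"),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", sample)


def speech_seconds(time_from, time_to):
    """Return the speech length in whole seconds (hypothetical helper)."""
    start = datetime.fromisoformat(time_from)
    end = datetime.fromisoformat(time_to)
    return int((end - start).total_seconds())


query = "SELECT member, time_from, time_to FROM data ORDER BY time_from"
durations = [
    (member, speech_seconds(t_from, t_to))
    for member, t_from, t_to in conn.execute(query)
]
print(durations)
```

datetime.fromisoformat requires Python 3.7 or later, whereas the scraper itself ran on Python 2.7, so this is meant for analysing the exported data rather than for the scraper.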

Statistics

Average successful run time: about 3 hours

Total run time: 3 days

Total cpu time used: about 1 hour

Total disk space used: 30.2 MB

History

  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    362 pages scraped
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    3033 pages scraped
  • Manually ran revision 1b8d366d and completed successfully.
    nothing changed in the database
    3095 pages scraped
  • Manually ran revision 4c92df28 and failed.
    nothing changed in the database
    7 pages scraped
  • Manually ran revision 4734532d and failed.
    nothing changed in the database
    2 pages scraped
  • Manually ran revision 4734532d and failed.
    42780 records added, 42780 records removed in the database
    2140 pages scraped
  • Manually ran revision ea6445cd and failed.
    nothing changed in the database
  • Manually ran revision 756eb81b and failed.
    48809 records added, 5720 records removed in the database
  • Manually ran revision abb4702a and failed.
    5720 records added in the database
  • Manually ran revision abb4702a and failed.
    44707 records added in the database
  • Manually ran revision abb4702a and failed.
    42391 records added in the database
  • Manually ran revision abb4702a and failed.
  • Manually ran revision abb4702a and failed.
  • Created on morph.io

Scraper code

Python

slovakia_parliament / scraper.py