wfdd / colombia-senado-scraper


A morph.io scraper for the members of the Colombian Senate.

Contributors: wfdd

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 !     The latest version of Python 3 is python-3.6.2 (you are using python-3.6.0, which is unsupported).
 !     We recommend upgrading by specifying the latest version (python-3.6.2).
       Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-3.6.0
-----> Installing pip
-----> Installing requirements with pip
       Collecting aiohttp==1.1.5 (from -r /tmp/build/requirements.txt (line 1))
       Downloading aiohttp-1.1.5.tar.gz (510kB)
       Collecting async-timeout==1.1.0 (from -r /tmp/build/requirements.txt (line 2))
       Downloading async_timeout-1.1.0-py3-none-any.whl
       Collecting chardet==2.3.0 (from -r /tmp/build/requirements.txt (line 3))
       Downloading chardet-2.3.0-py2.py3-none-any.whl (180kB)
       Collecting lxml==3.6.4 (from -r /tmp/build/requirements.txt (line 4))
       Downloading lxml-3.6.4.tar.gz (3.7MB)
       Collecting multidict==2.1.2 (from -r /tmp/build/requirements.txt (line 5))
       Downloading multidict-2.1.2.tar.gz (91kB)
       Collecting uvloop==0.6.5 (from -r /tmp/build/requirements.txt (line 6))
       Downloading uvloop-0.6.5.tar.gz (2.0MB)
       Collecting yarl==0.7.1 (from -r /tmp/build/requirements.txt (line 7))
       Downloading yarl-0.7.1.tar.gz (117kB)
       Installing collected packages: chardet, multidict, async-timeout, yarl, aiohttp, lxml, uvloop
       Running setup.py install for multidict: started
       Running setup.py install for multidict: finished with status 'done'
       Running setup.py install for yarl: started
       Running setup.py install for yarl: finished with status 'done'
       Running setup.py install for aiohttp: started
       Running setup.py install for aiohttp: finished with status 'done'
       Running setup.py install for lxml: started
       Running setup.py install for lxml: still running...
       Running setup.py install for lxml: finished with status 'done'
       Running setup.py install for uvloop: started
       Running setup.py install for uvloop: finished with status 'done'
       Successfully installed aiohttp-1.1.5 async-timeout-1.1.0 chardet-2.3.0 lxml-3.6.4 multidict-2.1.2 uvloop-0.6.5 yarl-0.7.1
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/connector.py", line 306, in connect
    yield from self._create_connection(req)
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/connector.py", line 585, in _create_connection
    transport, proto = yield from self._create_direct_connection(req)
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/connector.py", line 596, in _create_direct_connection
    hosts = yield from self._resolve_host(req.host, req.port)
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/connector.py", line 568, in _resolve_host
    self._resolver.resolve(host, port, family=self._family)
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/resolver.py", line 30, in resolve
    host, port, type=socket.SOCK_STREAM, family=family)
socket.gaierror: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "scraper.py", line 159, in <module>
    main()
  File "scraper.py", line 149, in main
    session, asyncio.Semaphore(10, loop=loop)))
  File "uvloop/loop.pyx", line 1186, in uvloop.loop.Loop.run_until_complete (uvloop/loop.c:23889)
  File "scraper.py", line 132, in gather_people
    async with session.get(base_url + 'index.php/buscar-senador') as resp:
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/client.py", line 529, in __aenter__
    self._resp = yield from self._coro
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/client.py", line 165, in _request
    conn = yield from self._connector.connect(req)
  File "/app/.heroku/python/lib/python3.6/site-packages/aiohttp/connector.py", line 316, in connect
    .format(key, exc.strerror)) from exc
aiohttp.errors.ClientOSError: [Errno -2] Cannot connect to host www.secretariasenado.gov.co:80 ssl:False [Name or service not known]
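
The run above failed because the host name could not be resolved (socket.gaierror, re-raised by aiohttp as ClientOSError). If such failures are transient, one option is to retry the request a few times before giving up. The following is only a minimal sketch using the standard library, not the scraper's actual code: retry_on_oserror, its parameters, and the delay values are hypothetical names chosen for illustration. It relies on socket.gaierror being a subclass of OSError; it assumes, without the scraper source to confirm it, that the aiohttp error seen in the traceback can be caught the same way.

```python
import asyncio
import socket


async def retry_on_oserror(make_request, attempts=3, base_delay=1.0):
    """Await make_request() up to `attempts` times.

    make_request is a zero-argument callable returning a fresh awaitable,
    so that each attempt starts a new request. Only OSError (which includes
    socket.gaierror) triggers a retry; the last failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_request()
        except OSError:
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    failures = []

    async def flaky_lookup():
        # Hypothetical stand-in for a request: fails twice, then succeeds.
        if len(failures) < 2:
            failures.append(1)
            raise socket.gaierror(-2, "Name or service not known")
        return "resolved"

    print(asyncio.run(retry_on_oserror(flaky_lookup, base_delay=0.01)))
```

A retry only helps when the outage is short-lived; if the host name has changed or been retired, the scraper's base URL would need updating instead.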

Data

Downloaded 207 times by everypolitician and wfdd

Showing 10 of 101 rows

id | name | image | group | term | email | website | phone | facebook | twitter | place_of_birth | source
1331:acuna-diaz-laureano-augusto | ACUÑA DIAZ LAUREANO AUGUSTO | | PARTIDO CONSERVADOR | 2014 | utl.laureanoacuna@senado.gov.co | | 3823262 - 3823263 | | | |
192:aguilar-hurtado-nerthink-mauricio | NERTHINK MAURICIO AGUILAR HURTADO | | PARTIDO DE INTEGRACIÓN NACIONAL | 2014 | | | 3823000 EXT 3335-3334-4398 | | SenadorAguilar | Bucaramanga - Santander |
1127:alvarez-montenegro-javier-tato | ÁLVAREZ MONTENEGRO JAVIER TATO | | PARTIDO LIBERAL | 2014 | tomachch@yahoo.es | | 3823000 EXT 3531-3032 | | | Pasto - Nariño |
1126:amin-escaf-miguel | AMÍN ESCAF MIGUEL | | PARTIDO DE LA U | 2014 | Sabalzakm@hotmail.com | | 3823000 EXT 3740 | | | Barranquilla - Atlántico |
1128:amin-hernandez-jaime-alejandro | AMIN HERNANDEZ JAIME ALEJANDRO | | PARTIDO CENTRO DEMOCRATICO | 2014 | leidycasallas_27@hotmail.com | | 3823000 EXT 4444-4443 | | | Barranquilla - Atlantico |
1129:andrade-casama-luis-evelis | ANDRADE CASAMA LUIS EVELIS | | PARTIDO MOVIMIENTO ALTERNATIVO INDIGENA Y SOCIAL “MAIS” | 2014 | utlsenadorleac@gmail.com | | 3823000 EXT 4347-4349 | | Luis_Evelis | Rio Sucio - Choco |
194:andrade-serrano-hernan-francisco | HERNÁN FRANCISCO ANDRADE SERRANO | | PARTIDO CONSERVADOR | 2014 | hector.alfonso.lopez@senado.gov.co | | 3823000 EXT 3162-3163 | | AndradeSenador | Neiva - Huila |
1131:araujo-rumie-fernando-nicolas | ARAUJO RUMIE FERNANDO NICOLAS | | PARTIDO CENTRO DEMOCRÁTICO | 2014 | nicolas.araujo@senado.gov.co;caronader@hotmail.com | | 3823000 EXT 3358-3359 | https://www.facebook.com/Senador Fernando Nicolás Araujo | FNAraujoR | Cartagena - Bolívar |
71:ashton-giraldo-alvaro-antonio | ÁLVARO ANTONIO ASTHON GIRALDO | | PARTIDO LIBERAL | 2014 | alvaroashton11@gmail.com | | 3823000 EXT 3345-3346-3407 | | ALVAROASHTON | Barranquilla - Atlántico |
196:avirama-marco-anibal | MARCO ANIBAL AVIRAMA AVIRAMA | | PARTIDO ALIANZA SOCIAL INDEPENDIENTE | 2014 | marco.avirama.avirama@senado.gov.co | | 3823000 EXT 4012-4013-4045 | | | Puracé - Cauca |
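
Some cells in the rows above pack more than one value: one email cell holds two addresses joined by a semicolon, and place_of_birth values appear to follow a "City - Department" pattern with inconsistent spacing. A consumer of the data might split them as sketched below. This is a hedged illustration only: split_emails and split_place are hypothetical helpers, not part of the scraper, and the separator conventions are inferred from the ten sample rows rather than documented anywhere.

```python
def split_emails(raw):
    """Split a semicolon-separated email cell into a list of addresses."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def split_place(raw):
    """Split 'City - Department' into a (city, department) tuple.

    The department is None when the ' - ' separator is absent.
    Assumes the first ' - ' is the separator, which would mis-split
    a city name that itself contains ' - '.
    """
    city, sep, department = raw.partition(" - ")
    return city.strip(), (department.strip() if sep else None)


if __name__ == "__main__":
    print(split_emails("nicolas.araujo@senado.gov.co;caronader@hotmail.com"))
    print(split_place("Rio Sucio -  Choco"))
```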

Statistics

Average successful run time: 3 minutes

Total run time: 1 day

Total cpu time used: 10 minutes

Total disk space used: 101 KB

History

  • Auto ran revision 665fde6b and failed.
    nothing changed in the database
  • Auto ran revision 665fde6b and completed successfully.
    101 records added, 101 records removed in the database
    202 pages scraped
  • Auto ran revision 665fde6b and completed successfully.
    101 records added, 101 records removed in the database
  • Auto ran revision 665fde6b and failed.
    nothing changed in the database
  • Auto ran revision 665fde6b and failed.
    nothing changed in the database
  • ...
  • Created on morph.io

Scraper code

Python

colombia-senado-scraper / scraper.py