OddBloke / imdb-top-250

IMDb Top 250


This scraper collects the IMDb Top 250 chart from http://akas.imdb.com/chart/top

It includes three tables:

  • data contains the most recently scraped top 250,
  • scraper_run maps each scrape time to a run id,
  • movie_rating lists every ranking ever captured, each with its run id.
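
The relationship between the three tables can be sketched as follows. This is only an illustration: the table names come from the list above, but every column name (scraped_at, position, title, rating) and both helper functions are hypothetical, and the real scraper.py may define things differently.

```python
import sqlite3

# Hypothetical schema: only the three table names come from the project
# description; all column names are assumptions made for illustration.
SCHEMA = """
CREATE TABLE scraper_run (
    id INTEGER PRIMARY KEY,
    scraped_at TEXT NOT NULL
);
CREATE TABLE movie_rating (
    run_id INTEGER NOT NULL REFERENCES scraper_run(id),
    position INTEGER NOT NULL,
    title TEXT NOT NULL,
    rating REAL
);
CREATE TABLE data (
    position INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    rating REAL
);
"""


def build_example_db():
    """Create an in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def rankings_for_latest_run(conn):
    """Return (position, title) rows captured by the highest run id."""
    return conn.execute(
        "SELECT position, title FROM movie_rating "
        "WHERE run_id = (SELECT MAX(id) FROM scraper_run) "
        "ORDER BY position"
    ).fetchall()
```

Under these assumptions, data would hold the same rows that rankings_for_latest_run returns, while movie_rating keeps the full history across runs.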

Forked from ScraperWiki

Contributors: OddBloke

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
Injecting scraper and running...
Traceback (most recent call last):
  File "scraper.py", line 102, in <module>
    main()
  File "scraper.py", line 89, in main
    session = _get_db_session()
  File "scraper.py", line 47, in _get_db_session
    Base.metadata.create_all(engine)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/schema.py", line 2148, in create_all
    bind.create(self, checkfirst=checkfirst, tables=tables)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1698, in create
    connection=connection, **kwargs)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1740, in _run_visitor
    **kwargs).traverse_single(element)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/sql/visitors.py", line 83, in traverse_single
    return meth(obj, **kw)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/ddl.py", line 36, in visit_metadata
    collection = [t for t in sql_util.sort_tables(tables) if self._can_create(t)]
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/ddl.py", line 29, in _can_create
    return not self.checkfirst or not self.dialect.has_table(self.connection, table.name, schema=table.schema)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py", line 427, in has_table
    cursor = _pragma_cursor(connection.execute("%stable_info(%s)" % (pragma, qtable)))
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1191, in execute
    params)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1287, in _execute_text
    return self.__execute_context(context)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1302, in __execute_context
    context.parameters[0], context=context)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1401, in _cursor_execute
    context)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1394, in _cursor_execute
    context)
  File "/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 299, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DatabaseError: (DatabaseError) database disk image is malformed 'PRAGMA table_info("scraper_run")' ()
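
"database disk image is malformed" is the message SQLite gives when it detects corruption in the database file, so the failure is raised before any scraping starts. One way to confirm this, sketched here with Python's standard sqlite3 module, is to run SQLite's own PRAGMA integrity_check against the file. The function name integrity_problems and the path passed to it are hypothetical; the scraper's actual database location is not shown in the output above.

```python
import sqlite3


def integrity_problems(db_path):
    """Return the problems SQLite reports for db_path; empty list if healthy.

    db_path is a hypothetical argument: pass whatever file the scraper's
    engine actually points at.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check produces any rows.
        return [str(exc)]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    # A healthy database yields the single row "ok".
    return [] if messages == ["ok"] else messages
```

An empty result suggests the file is readable; any returned messages suggest the file needs restoring from a backup or rebuilding before the scraper can run again.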

Statistics

Average successful run time: 27 minutes

Total run time: 12 days

Total cpu time used: 3 days

Total disk space used: 11.5 MB

History

  • Auto ran revision b8cfa11b and failed.
  • Auto ran revision b8cfa11b and failed.
  • Auto ran revision b8cfa11b and failed.
  • Auto ran revision b8cfa11b and failed.
  • Auto ran revision b8cfa11b and failed.
  • ...
  • Forked from ScraperWiki


Scraper code

Python

imdb-top-250 / scraper.py