mlandauer / fed_mp_data

Scrapes www.aph.gov.au

Home – Parliament of Australia


fedmpdata

A Morph scraper that grabs MPs' details from APH.gov.au. Based on Henare's scraper.

TODO

  • tests
  • senators
  • states
  • open data

Contributors richygit

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
-----> Ruby app detected
-----> Compiling Ruby
-----> Using Ruby version: ruby-2.0.0
-----> Installing dependencies using 1.7.12
       Running: bundle install --without development:test --path vendor/bundle --binstubs vendor/bundle/bin -j4 --deployment
       Fetching gem metadata from https://rubygems.org/.........
       Installing Ascii85 1.0.2
       Installing afm 0.2.2
       Installing rake 10.1.0
       Installing hashery 2.1.1
       Installing httpclient 2.6.0.1
       Installing mime-types 2.3
       Installing net-http-digest_auth 1.4
       Installing net-http-persistent 2.9.4
       Installing mini_portile 0.6.0
       Installing ntlm-http 0.1.1
       Installing unf_ext 0.0.6
       Installing webrobots 0.1.1
       Installing ruby-rc4 0.1.5
       Using bundler 1.7.12
       Installing ttfunk 1.3.0
       Installing unf 0.1.4
       Installing pdf-reader 1.3.3
       Installing domain_name 0.5.21
       Installing http-cookie 1.0.2
       Installing sqlite3 1.3.10
       Installing sqlite_magic 0.0.5
       Installing scraperwiki 3.0.2
       Installing nokogiri 1.6.3.1
       Installing mechanize 2.7.3
       Your bundle is complete!
       Gems in the groups development and test were not installed.
       It was installed into ./vendor/bundle
       Post-install message from pdf-reader:
       ********************************************
       v1.0.0 of PDF::Reader introduced a new page-based API. There are extensive
       examples showing how to use it in the README and examples directory.
       For detailed documentation, check the rdocs for the PDF::Reader,
       PDF::Reader::Page and PDF::Reader::ObjectHash classes.
       The old API is marked as deprecated but will continue to work with no
       visible warnings for now.
       ********************************************
       Bundle completed (23.51s)
       Cleaning up the bundler cache.
###### WARNING:
       You have not declared a Ruby version in your Gemfile.
       To set your Ruby version add this line to your Gemfile:
       ruby '2.0.0'
       # See https://devcenter.heroku.com/articles/ruby-versions for more information.
-----> Discovering process types
       Procfile declares types -> scraper
       Default process types for Ruby -> rake, console
Injecting scraper and running...
I, [2015-06-15T10:29:36.836259 #11] INFO -- : Scraping CSV
I, [2015-06-15T10:29:39.465684 #11] INFO -- : Scraping PDF
/app/vendor/bundle/ruby/2.0.0/gems/pdf-reader-1.3.3/lib/pdf/reader/page_layout.rb:17:in `initialize': undefined method `[]' for #<PDF::Reader::Reference:0x007f658d113768 @id=1134, @gen=0> (NoMethodError)
    from /app/vendor/bundle/ruby/2.0.0/gems/pdf-reader-1.3.3/lib/pdf/reader/page_text_receiver.rb:49:in `new'
    from /app/vendor/bundle/ruby/2.0.0/gems/pdf-reader-1.3.3/lib/pdf/reader/page_text_receiver.rb:49:in `content'
    from /app/vendor/bundle/ruby/2.0.0/gems/pdf-reader-1.3.3/lib/pdf/reader/page.rb:76:in `text'
    from /app/pdf_scraper.rb:20:in `block in scrape_pdf'
    from /app/pdf_scraper.rb:19:in `each'
    from /app/pdf_scraper.rb:19:in `scrape_pdf'
    from /app/pdf_scraper.rb:13:in `scrape'
    from /app/scraper_main.rb:48:in `main'
    from scraper.rb:3:in `<main>'
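
The stack trace above shows the run aborting when the text call on a page raises NoMethodError inside the page loop in pdf_scraper.rb. One possible way to stop a single unreadable page from ending the whole run is to rescue errors per page and record which pages were skipped. The following is a minimal sketch only, not the scraper's actual code: extract_page_texts and FakePage are hypothetical names, and FakePage merely stands in for a pdf-reader page object so the example runs without any gems.

```ruby
require 'logger'

# Hypothetical helper (not from fed_mp_data): collects the text of each page
# and skips any page whose #text call raises, instead of aborting the run.
# Returns [texts, failed_page_numbers], with page numbers starting at 1.
def extract_page_texts(pages, logger = Logger.new($stderr))
  texts = []
  failed = []
  pages.each_with_index do |page, index|
    begin
      texts << page.text
    rescue StandardError => e # NoMethodError is a StandardError subclass
      failed << index + 1
      logger.warn("Skipping page #{index + 1}: #{e.class}: #{e.message}")
    end
  end
  [texts, failed]
end

# Stand-in for a PDF page (assumption: only #text is needed by the helper).
# A nil body simulates a page that cannot be read.
FakePage = Struct.new(:body) do
  def text
    raise NoMethodError, "simulated unreadable page" if body.nil?
    body
  end
end

pages = [FakePage.new("Page one"), FakePage.new(nil), FakePage.new("Page three")]
texts, failed = extract_page_texts(pages, Logger.new(nil)) # nil device: no log output
puts "Extracted #{texts.length} pages, skipped pages: #{failed.inspect}"
```

Skipping pages hides data loss unless the skipped page numbers are reported, which is why the sketch returns them alongside the extracted text. Whether skipping is acceptable for this scraper, or whether the underlying parsing error should be fixed instead, is left open here.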

Statistics

Total run time: 1 minute

Total cpu time used: less than 5 seconds

Total disk space used: 112 MB

History

  • Manually ran revision 75bce6df and failed.
    nothing changed in the database
    3 pages scraped
  • Created on morph.io

Scraper code

Ruby

fed_mp_data / scraper.rb