blablupcom / manta

Scrapes www.gov.uk

GOV.UK - The place to find government services and information - Simpler, clearer, faster


Contributors blablupcom

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...
Injecting scraper and running...
/app/.heroku/python/lib/python2.7/site-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "html.parser")
markup_type=markup_type))
<!DOCTYPE html>
<html>
<head>
<title>Pardon Our Interruption</title>
<link href="//cdn.distilnetworks.com/css/distil.css" media="all" rel="stylesheet" type="text/css">
<meta content="text/html; charset=utf-8" http-equiv="content-type"/>
<meta content="width=1000" name="viewport"/>
<meta content="noindex, nofollow" name="robots">
<meta content="max-age=0" http-equiv="cache-control"/>
<meta content="no-cache" http-equiv="cache-control"/>
<meta content="0" http-equiv="expires"/>
<meta content="Tue, 01 Jan 1980 1:00:00 GMT" http-equiv="expires"/>
<meta content="no-cache" http-equiv="pragma"/>
</meta></link></head>
<body class="block-page">
<div class="container">
<div class="row">
<div class="sidebar col-lg-4 col-sm-5">
<img alt="0" src="//cdn.distilnetworks.com/images/anomaly-detected.png"> </img></div>
<div class="content col-lg-8 col-sm-7">
<h1>Pardon Our Interruption...</h1>
<p> As you were browsing <strong>http://www.manta.com</strong> something about your browser made us think you were a bot. There are a few reasons this might happen: </p>
<ul>
<li>You're a power user moving through this website with super-human speed.</li>
<li>You've disabled JavaScript in your web browser.</li>
<li>A third-party browser plugin, such as Ghostery or NoScript, is preventing JavaScript from running.
Additional information is available in this <a href="https://support.distilnetworks.com/customer/portal/articles/1842381-third-party-browser-plugins-that-block-javascript" target="_blank" title="Third party browser plugins that block javascript">support article</a>.</li>
</ul>
<p> To request an unblock, please fill out the form below and we will review it as soon as possible. </p>
<form action="axcvadyqeqqbscxvrfaxcva.html" id="bayeqqswaabxsf" method="POST" style="display:none"><label>Ignore: <input name="name" type="text"/></label><label>Ignore: <input name="email" type="text"/></label><label>Ignore: <input type="submit" value="Submit"/></label></form>
<form action="http://verify.distil.it/distil_blocked.php" id="demoForm" method="post">
<div class="form-group"> <label for="first_name">First Name</label> <input class="form-control" id="first_name" name="first_name" type="text" value=""> </input></div>
<div class="form-group"> <label for="last_name">Last Name</label> <input class="form-control" id="last_name" name="last_name" type="text" value=""> </input></div>
<div class="form-group"> <label for="email">E-mail</label> <input class="form-control" id="email" name="email" type="text" value=""> </input></div>
<div class="form-group hide"> <label for="city">City</label> <input class="form-control hide" id="city" name="city" type="text" value=""> </input></div>
<input name="B" type="hidden" value="2514:50.116.3.88:1E9B0FF7-9E1F-379F-A90E-F22277DBECF9"/>
<input name="P" type="hidden" value="1E9B0FF7-9E1F-379F-A90E-F22277DBECF9"/>
<input name="I" type="hidden" value=""/>
<input name="U" type="hidden" value=""/>
<input name="V" type="hidden" value="9"/>
<input name="O" type="hidden" value=""/>
<input name="D" type="hidden" value="2514"/>
<input name="A" type="hidden" value="589"/>
<input name="H" type="hidden" value="www.manta.com"/>
<input name="LOADED" type="hidden" value="2015-08-02 13:14:06"/>
<input id="distil_block_identity_info" name="PB" type="hidden" value=""/>
<hr> <button class="btn btn-primary btn-lg" type="submit">Request Unblock</button> </hr></form>
<p id="extraUnblock"> <small style="font-size: 8pt"> You reached this page when attempting to access http://www.manta.com/world/Oceania/Australia/ from 50.116.3.88 on 2015-08-02 13:14:06 GMT.<br/> Trace: 5A88AEB8-3918-11E5-B143-FCFD419E1EA2 via 94fab34c-ba88-4194-bb82-95bc7f22bea4 </small> </p>
</div>
</div>
</div>
</body>
</html>
Traceback (most recent call last):
  File "scraper.py", line 8, in <module>
    title = s.find('span', attrs={'itemprop':'title'}).text
AttributeError: 'NoneType' object has no attribute 'text'
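
The console output above suggests two separate issues: BeautifulSoup was called without an explicit parser, and the response was a "Pardon Our Interruption" block page, so `find` returned `None` and `.text` raised the `AttributeError`. The following is only a minimal sketch of how the parsing step could guard against both; it is not the scraper's actual code. The names `BLOCK_MARKERS`, `is_block_page`, `extract_title`, and `parse_title` are hypothetical, and the marker strings are assumptions taken from the HTML in this log.

```python
# Hypothetical markers, taken from the block page shown in the console output.
BLOCK_MARKERS = ("Pardon Our Interruption", "distilnetworks.com")


def is_block_page(html):
    """Return True if the response looks like the bot-block page from the log."""
    return any(marker in html for marker in BLOCK_MARKERS)


def extract_title(soup):
    """Return the text of <span itemprop="title">, or None when it is absent."""
    node = soup.find('span', attrs={'itemprop': 'title'})
    # find() returns None when nothing matches, so check before reading .text.
    return node.text.strip() if node is not None else None


def parse_title(html):
    """Parse one page; return None instead of crashing on a block page."""
    if is_block_page(html):
        return None
    # Imported here so the helpers above can be used without bs4 installed.
    from bs4 import BeautifulSoup
    # Naming the parser explicitly avoids the UserWarning seen in the log.
    soup = BeautifulSoup(html, "html.parser")
    return extract_title(soup)
```

With a guard like this, a blocked request would yield `None` for the caller to log or skip, rather than ending the run with status code 1. It does not address the block itself.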

Statistics

Total run time: 7 minutes

Total cpu time used: less than 10 seconds

Total disk space used: 24.3 KB

History

  • Manually ran revision 16256ade and failed.
    nothing changed in the database
    2 pages scraped
  • Manually ran revision e84bc476 and failed.
    nothing changed in the database
    1 page scraped
  • Manually ran revision 3cce7be8 and failed.
    nothing changed in the database
  • Manually ran revision 6d8e0fc0 and failed.
    nothing changed in the database
  • Manually ran revision 472f7558 and failed.
    nothing changed in the database
  • ...
  • Created on morph.io


Scraper code

manta