packt-joeld / how-to-write-a-screen-scraper-4

How to Write a Screen Scraper: 3


Contributors: packt-joeld

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
Injecting scraper and running...
<html>
<head>
<title>Scrape this table of best selling albums</title>
</head>
<body>
<h1>Scrape this table of best selling albums</h1>
<table class="data">
<thead><tr class="tableizer-firstrow"><th>Artist</th><th>Album</th><th>Released</th><th>Empty column</th><th>Sales m</th></tr></thead><tbody>
<tr><td>Michael Jackson</td><td>Thriller</td><td>1982</td><td>&nbsp;</td><td>110</td></tr>
<tr><td>AC/DC</td><td>Back in Black</td><td>1980</td><td>&nbsp;</td><td>49</td></tr>
<tr><td>Pink Floyd</td><td>The Dark Side of the Moon</td><td>1973</td><td>&nbsp;</td><td>45</td></tr>
<tr><td>Whitney Houston / Various artists</td><td>The Bodyguard</td><td>1992</td><td>&nbsp;</td><td>44</td></tr>
<tr><td>Meat Loaf </td><td>Bat Out of Hell</td><td>1977</td><td>&nbsp;</td><td>43</td></tr>
<tr><td>Eagles</td><td>Their Greatest Hits (1971â€"1975)</td><td>1976</td><td>&nbsp;</td><td>42</td></tr>
</tbody></table>
<p>
<a class="next" href="/scraping-for-everyone/webpages/example_table_2.html">next page</a>
</p>
</body>
</html>
{'Released': '1982', 'Album': 'Thriller', 'Artist': 'Michael Jackson', 'Sales m': '110'}
------------
{'Released': '1980', 'Album': 'Back in Black', 'Artist': 'AC/DC', 'Sales m': '49'}
------------
{'Released': '1973', 'Album': 'The Dark Side of the Moon', 'Artist': 'Pink Floyd', 'Sales m': '45'}
------------
{'Released': '1992', 'Album': 'The Bodyguard', 'Artist': 'Whitney Houston / Various artists', 'Sales m': '44'}
------------
{'Released': '1977', 'Album': 'Bat Out of Hell', 'Artist': 'Meat Loaf ', 'Sales m': '43'}
------------
{'Released': '1976', 'Album': u'Their Greatest Hits (1971\xc3\xa2\xe2\x82\xac"1975)', 'Artist': 'Eagles', 'Sales m': '42'}
------------
[<Element a at 0x13b3ef0>]
https://paulbradshaw.github.io/scraping-for-everyone/webpages/example_table_2.html
<html>
<head>
<title>Scrape this table of best selling albums</title>
</head>
<body>
<h1>Scrape this table of best selling albums</h1>
<table class="data">
<thead><tr class="tableizer-firstrow"><th>Artist</th><th>Album</th><th>Released</th><th>Empty column</th><th>Sales m</th></tr></thead><tbody>
<tr><td>Various artists</td><td>Dirty Dancing</td><td>1987</td><td>&nbsp;</td><td>42</td></tr>
<tr><td>Fleetwood Mac</td><td>Rumours</td><td>1977</td><td>&nbsp;</td><td>40</td></tr>
<tr><td>Backstreet Boys</td><td>Millennium</td><td>1999</td><td>&nbsp;</td><td>40</td></tr>
<tr><td>Bee Gees / Various artists</td><td>Saturday Night Fever</td><td>1977</td><td>&nbsp;</td><td>40</td></tr>
<tr><td>Shania Twain</td><td>Come On Over</td><td>1997</td><td>&nbsp;</td><td>39</td></tr>
<tr><td>Led Zeppelin</td><td>Led Zeppelin IV</td><td>1971</td><td>&nbsp;</td><td>37</td></tr>
</tbody></table>
<p>
<a class="previous" href="/scraping-for-everyone/webpages/example_table_1.html">previous page</a>
</p>
</body>
</html>
{'Released': '1987', 'Album': 'Dirty Dancing', 'Artist': 'Various artists', 'Sales m': '42'}
------------
{'Released': '1977', 'Album': 'Rumours', 'Artist': 'Fleetwood Mac', 'Sales m': '40'}
------------
{'Released': '1999', 'Album': 'Millennium', 'Artist': 'Backstreet Boys', 'Sales m': '40'}
------------
{'Released': '1977', 'Album': 'Saturday Night Fever', 'Artist': 'Bee Gees / Various artists', 'Sales m': '40'}
------------
{'Released': '1997', 'Album': 'Come On Over', 'Artist': 'Shania Twain', 'Sales m': '39'}
------------
{'Released': '1971', 'Album': 'Led Zeppelin IV', 'Artist': 'Led Zeppelin', 'Sales m': '37'}
------------
[]
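
The console output above suggests three steps per page: print the HTML, turn each table row into a dictionary keyed by the column headers (leaving out the empty column), then look for a link with class `next` and follow it until none is found. A minimal sketch of that flow follows. It is not the scraper's actual source: the `[<Element a at ...>]` line suggests the original used lxml, whereas this version swaps in the standard library's `html.parser` so it runs without extra packages. `TableParser`, `scrape`, `SAMPLE_HTML`, and `BASE_URL` are hypothetical names, and the sample is a trimmed copy of the first page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Trimmed copy of the first page from the console output (two rows only).
SAMPLE_HTML = """
<table class="data">
<thead><tr class="tableizer-firstrow"><th>Artist</th><th>Album</th><th>Released</th><th>Empty column</th><th>Sales m</th></tr></thead><tbody>
<tr><td>Michael Jackson</td><td>Thriller</td><td>1982</td><td>&nbsp;</td><td>110</td></tr>
<tr><td>AC/DC</td><td>Back in Black</td><td>1980</td><td>&nbsp;</td><td>49</td></tr>
</tbody></table>
<p><a class="next" href="/scraping-for-everyone/webpages/example_table_2.html">next page</a></p>
"""

# Host taken from the URL printed in the console output.
BASE_URL = "https://paulbradshaw.github.io/"


class TableParser(HTMLParser):
    """Collects table rows as dicts and remembers the 'next' link, if any."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.headers = []
        self.rows = []
        self.next_href = None
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []
        elif tag == "a" and "next" in (attrs.get("class") or "").split():
            self.next_href = attrs.get("href")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if self._row is not None:
                # strip() also removes the non-breaking space from &nbsp;
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if not self.headers:
                self.headers = self._row  # first row supplies the keys
            else:
                # Skip empty cells, which drops the "Empty column" entry.
                record = {h: v for h, v in zip(self.headers, self._row) if v}
                self.rows.append(record)
            self._row = None


def scrape(html, base_url):
    """Return (rows, next_url); next_url is None when no 'next' link exists."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    next_url = urljoin(base_url, parser.next_href) if parser.next_href else None
    return parser.rows, next_url


rows, next_url = scrape(SAMPLE_HTML, BASE_URL)
for row in rows:
    print(row)
    print("------------")
print(next_url)
```

A full scraper would fetch `next_url`, call `scrape` again, and stop once it returns `None`, which corresponds to the empty list `[]` printed after the second page. Note that this sketch strips whitespace from every cell, so a value such as `'Meat Loaf '` would lose its trailing space, unlike the logged output.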

Data

Downloaded 0 times


rows 10 / 12

date_scraped | Album                           | Released | Sales m | Artist
             | Thriller                        | 1982     | 110     | Michael Jackson
             | Back in Black                   | 1980     | 49      | AC/DC
             | The Dark Side of the Moon       | 1973     | 45      | Pink Floyd
             | The Bodyguard                   | 1992     | 44      | Whitney Houston / Various artists
             | Bat Out of Hell                 | 1977     | 43      | Meat Loaf
             | Their Greatest Hits (1971–1975) | 1976     | 42      | Eagles
             | Dirty Dancing                   | 1987     | 42      | Various artists
             | Rumours                         | 1977     | 40      | Fleetwood Mac
             | Millennium                      | 1999     | 40      | Backstreet Boys
             | Saturday Night Fever            | 1977     | 40      | Bee Gees / Various artists
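
The Eagles title appears in the console output as `1971â€"1975`, where an en dash was evidently intended. That pattern is typical of UTF-8 bytes being decoded with a Windows-1252 style codec. The following is a minimal sketch of the mechanism, not the scraper's own code; the variable names are illustrative. Because the logged string ends in a straight quote rather than a curly one, the exact round trip shown here may not apply unchanged to the stored value.

```python
# An en dash (U+2013) is three bytes in UTF-8: E2 80 93.
original = "Their Greatest Hits (1971\u20131975)"

# Decoding those bytes as Windows-1252 turns each byte into its own character.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# If nothing else has altered the text, reversing the two steps restores it.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

When fetching a page, decoding the response bytes explicitly as UTF-8 rather than relying on a guessed encoding is one way to avoid this class of problem, although whether it would help here depends on how the source page itself was saved.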

Statistics

Average successful run time: less than a minute

Total run time: 5 minutes

Total cpu time used: less than 5 seconds

Total disk space used: 54 KB

History

  • Manually ran revision 4e04f237 and completed successfully.
    12 records added, 12 records removed in the database
  • Manually ran revision 399083e8 and failed.
    nothing changed in the database
  • Manually ran revision 8f86b420 and failed.
    nothing changed in the database
  • Manually ran revision 3876f00a and failed.
    nothing changed in the database
  • Manually ran revision 3876f00a and failed.
    nothing changed in the database
  • ...
  • Created on morph.io
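
The first history entry reports 12 records added and 12 removed, which is what one would expect if each run clears the table and writes the rows again. A minimal sketch of that pattern with Python's built-in `sqlite3` module is below. It is an assumption about how the data could be stored, not the scraper's code: `save_rows` and `COLUMNS` are hypothetical names, the table name `data` is assumed, and an in-memory database stands in for the file a real run would write. The `date_scraped` value is filled in here for illustration, although the rows listed under Data show it empty.

```python
import sqlite3
from datetime import date

# Column order copied from the Data table; date_scraped comes first there.
COLUMNS = ["Album", "Released", "Sales m", "Artist"]


def save_rows(conn, rows, scraped_on):
    """Replace the contents of the (assumed) `data` table with `rows`."""
    names = ["date_scraped"] + COLUMNS
    column_sql = ", ".join('"{}" TEXT'.format(name) for name in names)
    conn.execute("CREATE TABLE IF NOT EXISTS data ({})".format(column_sql))
    conn.execute("DELETE FROM data")  # every old record is removed...
    conn.executemany(
        "INSERT INTO data VALUES (?, ?, ?, ?, ?)",  # ...and written again
        [(scraped_on,) + tuple(row.get(c) for c in COLUMNS) for row in rows],
    )
    conn.commit()


rows = [
    {"Artist": "Michael Jackson", "Album": "Thriller", "Released": "1982", "Sales m": "110"},
    {"Artist": "AC/DC", "Album": "Back in Black", "Released": "1980", "Sales m": "49"},
]

conn = sqlite3.connect(":memory:")
save_rows(conn, rows, date.today().isoformat())
save_rows(conn, rows, date.today().isoformat())  # a second run does not duplicate rows
count = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
print(count)
```

Clearing and rewriting keeps the table in step with the source pages at the cost of losing earlier values; a scraper that needs history would instead insert with a unique key and a real `date_scraped` per run.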
