amoghds / Test_1

Basic Twitter Scraper


import scraperwiki
import simplejson
import urllib2

# Change QUERY to your search term of choice.
# Examples: 'newsnight', 'from:bbcnewsnight', 'to:bbcnewsnight'
QUERY = '#NaMo'
RESULTS_PER_PAGE = '100'
LANGUAGE = 'en'
NUM_PAGES = 1000

# Pages are numbered from 1, so the upper bound is NUM_PAGES + 1.
for page in range(1, NUM_PAGES + 1):
    base_url = 'http://search.twitter.com/search.json?q=%s&rpp=%s&lang=%s&page=%s' \
        % (urllib2.quote(QUERY), RESULTS_PER_PAGE, LANGUAGE, page)
    try:
        results_json = simplejson.loads(scraperwiki.scrape(base_url))
        for result in results_json['results']:
            # print result
            data = {}
            data['id'] = result['id']
            data['text'] = result['text']
            data['from_user'] = result['from_user']
            data['created_at'] = result['created_at']
            print data['from_user'], data['text']
            # Save the row, using the tweet id as the unique key.
            scraperwiki.sqlite.save(["id"], data)
    except Exception:
        # Stop at the first page that fails to download or parse.
        print 'Oh dear, failed to scrape %s' % base_url
        break
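
The URL construction in the loop above can be tried on its own, without touching the network. The following is a minimal sketch, written for Python 3 (the scraper itself is Python 2), so it uses `urllib.parse.quote` in place of `urllib2.quote`. The helper names `build_search_url` and `page_urls` are hypothetical and do not appear in the scraper; the endpoint and parameter names are copied from it.

```python
from urllib.parse import quote

# Endpoint copied from the scraper above; nothing is fetched from it here.
SEARCH_ENDPOINT = "http://search.twitter.com/search.json"


def build_search_url(query, results_per_page, language, page):
    """Return the search URL for one page of results (hypothetical helper)."""
    # quote() percent-encodes characters such as '#', so '#NaMo' becomes '%23NaMo'.
    return "%s?q=%s&rpp=%s&lang=%s&page=%s" % (
        SEARCH_ENDPOINT, quote(query), results_per_page, language, page)


def page_urls(query, results_per_page="100", language="en", num_pages=3):
    """Return the URLs for pages 1..num_pages inclusive (hypothetical helper)."""
    # range(1, num_pages + 1) is inclusive of the last page.
    return [build_search_url(query, results_per_page, language, page)
            for page in range(1, num_pages + 1)]


if __name__ == "__main__":
    for url in page_urls("#NaMo"):
        print(url)
```

Keeping the URL building in a separate function makes the page range and the query encoding easy to check before running the full scrape.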

Forked from ScraperWiki

Contributors amoghds

This scraper has not yet been run

Data

Downloaded 1 time by MikeRalphson


rows 10 / 216538

text | id | from_user | created_at
first @fixmystreet fr @mysociety now @FixMyTransport http://t.co/9re4iom we need real @open311 in #seattle #opengov #opendata #transit | 108990595585949696 | MrDataFerret |
@Gov Walker of WI can't be pleased w poor audit of state contract sunshine site http://t.co/pByo2g1 #opengov #opendata #wisconsin | 108986525211037696 | MrDataFerret |
" #OpenData Geeks to the Rescue" http://t.co/osK0iNu > nice work by @dbhume & @data_bc team. good step 2ward gr8r #opengov #dbchack | 108984924614307841 | asterix |
RT @chilobbyists: Official statement from Rahm's office on the release of more lobbyist data and our collaboration: http://t.co/ENpUMjY #opengov #opendata | 108982291694493696 | billgatewood |
RT @WorldBank: Where should we take #opendata next? Tell us. Live 9/13 1800 GMT http://ow.ly/6hCvz | 108980355138199552 | Mitalikm |
RT @WorldBank: Where should we take #opendata next? Tell us. Live 9/13 1800 GMT http://ow.ly/6hCvz | 108976950365798400 | abmakulec |
RT @chilobbyists: Official statement from Rahm's office on the release of more lobbyist data and our collaboration: http://t.co/ENpUMjY #opengov #opendata | 108976121231577088 | rougeux |
RT @WorldBank: Where should we take #opendata next? Tell us. Live 9/13 1800 GMT http://ow.ly/6hCvz | 108976005695283200 | avilarenata |
RT @vinsumner: #AmsterdamSmartCity add your information http://t.co/1z5SzDI about the city , #opendata , #picnic11 | 108975904633524224 | spolliaro |
RT @worldbank: Where should we take #opendata next? Tell us. Live 9/13 1800 GMT http://t.co/h9PV2iO | 108975758717886464 | MGPSI |
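
The scraper saves each row with `id` as the unique key, so saving the same tweet twice should leave one row rather than two. The sketch below approximates that behaviour with the standard-library `sqlite3` module in Python 3; it is not the `scraperwiki` library, and the table name `tweets`, the helpers `make_db` and `save_row`, and the use of `INSERT OR REPLACE` are assumptions made for illustration. The sample values are taken from the preview rows, with `created_at` left empty as it is there.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'tweets' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tweets ("
        "id INTEGER PRIMARY KEY, text TEXT, from_user TEXT, created_at TEXT)")
    return conn


def save_row(conn, row):
    """Insert a row, replacing any existing row that has the same id."""
    conn.execute(
        "INSERT OR REPLACE INTO tweets (id, text, from_user, created_at) "
        "VALUES (:id, :text, :from_user, :created_at)", row)


if __name__ == "__main__":
    conn = make_db()
    first = {"id": 108990595585949696, "from_user": "MrDataFerret",
             "text": "first @fixmystreet fr @mysociety now @FixMyTransport",
             "created_at": None}
    second = {"id": 108984924614307841, "from_user": "asterix",
              "text": '" #OpenData Geeks to the Rescue"',
              "created_at": None}
    # Saving 'first' twice mimics re-scraping the same tweet.
    for row in (first, first, second):
        save_row(conn, row)
    print(conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0])
```

Note that retweets of the same message carry different ids, so keying on `id` removes repeated saves of one tweet but not repeated text.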

Statistics

Total run time: less than 5 seconds

Total cpu time used: less than 5 seconds

Total disk space used: 20.3 KB

History

Scraper code

Python

Test_1 / scraper.py