andrewbrazzatti / global_research_identifier_database

The Global Research Identifier Database

Scrapes grid.ac, dx.doi.org, figshare.com, and 2 other domains

GRID - Global Research Identifier Database


Contributors: andrewbrazzatti

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 !     The latest version of Python 2 is python-2.7.14 (you are using python-2.7.9, which is unsupported).
 !     We recommend upgrading by specifying the latest version (python-2.7.14).
       Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.9
-----> Installing pip
-----> Installing requirements with pip
       Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r /tmp/build/requirements.txt (line 6))
       Cloning http://github.com/openaustralia/scraperwiki-python.git (to revision morph_defaults) to /app/.heroku/src/scraperwiki
       Switched to a new branch 'morph_defaults'
       Branch morph_defaults set up to track remote branch morph_defaults from origin.
       Collecting beautifulsoup4==4.6.0 (from -r /tmp/build/requirements.txt (line 7))
       Downloading https://files.pythonhosted.org/packages/a6/29/bcbd41a916ad3faf517780a0af7d0254e8d6722ff6414723eedba4334531/beautifulsoup4-4.6.0-py2-none-any.whl (86kB)
       Collecting dumptruck>=0.1.2 (from scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/15/27/3330a343de80d6849545b6c7723f8c9a08b4b104de964ac366e7e6b318df/dumptruck-0.1.6.tar.gz
       Collecting requests (from scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/f1/ca/10332a30cb25b627192b4ea272c351bce3ca1091e541245cccbace6051d8/requests-2.20.0-py2.py3-none-any.whl (60kB)
       Collecting idna<2.8,>=2.5 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl (58kB)
       Collecting certifi>=2017.4.17 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/56/9d/1d02dd80bc4cd955f98980f28c5ee2200e1209292d5f9e9cc8d030d18655/certifi-2018.10.15-py2.py3-none-any.whl (146kB)
       Collecting urllib3<1.25,>=1.21.1 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl (118kB)
       Collecting chardet<3.1.0,>=3.0.2 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
       Installing collected packages: dumptruck, idna, certifi, urllib3, chardet, requests, scraperwiki, beautifulsoup4
       Running setup.py install for dumptruck: started
       Running setup.py install for dumptruck: finished with status 'done'
       Running setup.py develop for scraperwiki
       Successfully installed beautifulsoup4-4.6.0 certifi-2018.10.15 chardet-3.0.4 dumptruck-0.1.6 idna-2.7 requests-2.20.0 scraperwiki urllib3-1.24.1
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...

Data

Downloaded 89 times by redbox-mint, shilob, and andrewbrazzatti


Download table (as CSV) Download SQLite database (8.9 MB) Use the API
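
For readers who want to use the API option from a script, here is a minimal sketch that only builds a query URL and makes no request. The endpoint layout (`https://api.morph.io/<owner>/<scraper>/data.<format>` with `key` and `query` parameters), the table name `data`, and the helper name `build_api_url` are assumptions that do not appear on this page; check them against the morph.io API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint layout; not confirmed by this page.
API_BASE = "https://api.morph.io"
SCRAPER = "andrewbrazzatti/global_research_identifier_database"


def build_api_url(api_key, query, fmt="json"):
    """Return a data-API URL for the given SQL query (hypothetical helper)."""
    params = urlencode({"key": api_key, "query": query})
    return "{}/{}/data.{}?{}".format(API_BASE, SCRAPER, fmt, params)


# "YOUR_API_KEY" is a placeholder; the table name "data" is an assumption.
url = build_api_url("YOUR_API_KEY", "select name, grid_id from data limit 10")
print(url)
```

The key is passed as an argument so that it stays out of source control.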

Showing 10 of 89785 rows

name                                            | email_address | grid_id       | wikipedia_url | established
National University of Benin                    |               | grid.4268.8   |               |
Joint Research Centre                           |               | grid.5368.8   |               | 1961
Pierre and Marie Curie University               |               | grid.5805.8   |               | 1971
University of Auvergne                          |               | grid.7903.d   |               | 1519
Blaise Pascal University                        |               | grid.7907.9   |               | 1854
Institute for Prospective Technological Studies |               | grid.9238.0   |               |
National Institute for Medical Research         |               | grid.16813.3d |               | 1913
Hebrew Rehabilitation Center For Aged           |               | grid.17867.3f |               | 1903
Universite Rennes 2-Haute Bretagne              |               | grid.21395.3c |               |
University of Paris-Sorbonne                    |               | grid.46900.3b |               | 1970
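
Once the SQLite database is downloaded, the columns listed above can be queried with Python's standard `sqlite3` module. The sketch below is illustrative only: the table name `data`, the column types, and the helper name `rows_with_established` are assumptions, and an in-memory table holding three of the rows shown above stands in for the downloaded file.

```python
import sqlite3


def rows_with_established(conn, limit=10):
    """Return (name, grid_id, established) for rows that have a founding year."""
    sql = (
        "SELECT name, grid_id, established FROM data "
        "WHERE established IS NOT NULL AND established != '' "
        "ORDER BY established LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


# Stand-in for the downloaded database; table name and types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (name TEXT, email_address TEXT, grid_id TEXT, "
    "wikipedia_url TEXT, established INTEGER)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?, ?)",
    [
        ("National University of Benin", None, "grid.4268.8", None, None),
        ("Joint Research Centre", None, "grid.5368.8", None, 1961),
        ("University of Auvergne", None, "grid.7903.d", None, 1519),
    ],
)

result = rows_with_established(conn)
for name, grid_id, established in result:
    print(name, grid_id, established)
```

To run this against the real data, pass the path of the downloaded file to `sqlite3.connect` in place of `":memory:"` and skip the `CREATE TABLE` and `INSERT` statements.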

Statistics

Average successful run time: 4 minutes

Total run time: 20 minutes

Total CPU time used: 7 minutes

Total disk space used: 8.92 MB

History

  • Manually ran revision 88ddc1fa and completed successfully.
    89506 records added, 74244 records removed in the database
    5 pages scraped
  • Manually ran revision c5f03168 and completed successfully.
    74523 records added, 74523 records removed in the database
  • Manually ran revision 95122b18 and completed successfully.
    74523 records added in the database
  • Manually ran revision 95122b18 and completed successfully.
    nothing changed in the database
  • Manually ran revision 89a2ed95 and completed successfully.
    74524 records added in the database
  • Created on morph.io
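
The counts in the most recent history entry (revision 88ddc1fa) can be checked against the row total in the Data section. This is a small worked calculation, not something the page states: it assumes the database held 74523 records before that run, which is the figure reported by the preceding run.

```python
# Counts from the most recent history entry (revision 88ddc1fa).
records_added = 89506
records_removed = 74244

# Assumption: the database held 74523 records before this run,
# the figure reported by the preceding run (revision c5f03168).
rows_before = 74523

rows_after = rows_before - records_removed + records_added
print(rows_after)  # 89785, the row total reported in the Data section
```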