andrewbrazzatti / global_research_identifier_database

The Global Research Identifier Database

Scrapes grid.ac, dx.doi.org, figshare.com, and 2 other domains

GRID - Global Research Identifier Database


Contributors: andrewbrazzatti

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 !     The latest version of Python 2 is python-2.7.14 (you are using python-2.7.9, which is unsupported).
 !     We recommend upgrading by specifying the latest version (python-2.7.14).
       Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.9
-----> Installing pip
-----> Installing requirements with pip
       Obtaining scraperwiki from git+http://github.com/openaustralia/scraperwiki-python.git@morph_defaults#egg=scraperwiki (from -r /tmp/build/requirements.txt (line 6))
       Cloning http://github.com/openaustralia/scraperwiki-python.git (to revision morph_defaults) to /app/.heroku/src/scraperwiki
       Switched to a new branch 'morph_defaults'
       Branch morph_defaults set up to track remote branch morph_defaults from origin.
       Collecting beautifulsoup4==4.6.0 (from -r /tmp/build/requirements.txt (line 7))
       Downloading https://files.pythonhosted.org/packages/a6/29/bcbd41a916ad3faf517780a0af7d0254e8d6722ff6414723eedba4334531/beautifulsoup4-4.6.0-py2-none-any.whl (86kB)
       Collecting dumptruck>=0.1.2 (from scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/15/27/3330a343de80d6849545b6c7723f8c9a08b4b104de964ac366e7e6b318df/dumptruck-0.1.6.tar.gz
       Collecting requests (from scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/f1/ca/10332a30cb25b627192b4ea272c351bce3ca1091e541245cccbace6051d8/requests-2.20.0-py2.py3-none-any.whl (60kB)
       Collecting idna<2.8,>=2.5 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl (58kB)
       Collecting certifi>=2017.4.17 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/56/9d/1d02dd80bc4cd955f98980f28c5ee2200e1209292d5f9e9cc8d030d18655/certifi-2018.10.15-py2.py3-none-any.whl (146kB)
       Collecting urllib3<1.25,>=1.21.1 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl (118kB)
       Collecting chardet<3.1.0,>=3.0.2 (from requests->scraperwiki->-r /tmp/build/requirements.txt (line 6))
       Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
       Installing collected packages: dumptruck, idna, certifi, urllib3, chardet, requests, scraperwiki, beautifulsoup4
       Running setup.py install for dumptruck: started
       Running setup.py install for dumptruck: finished with status 'done'
       Running setup.py develop for scraperwiki
       Successfully installed beautifulsoup4-4.6.0 certifi-2018.10.15 chardet-3.0.4 dumptruck-0.1.6 idna-2.7 requests-2.20.0 scraperwiki urllib3-1.24.1
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...

Data

Downloaded 88 times by redbox-mint, shilob, andrewbrazzatti


Download table (as CSV) Download SQLite database (8.9 MB) Use the API
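
The "Use the API" option can be sketched as a URL built from an SQL query. This is a minimal sketch, assuming morph.io's API URL pattern of https://api.morph.io/<owner>/<scraper>/data.<format> with "key" and "query" parameters, and assuming the table is named "data"; the helper name build_api_url and the key value are hypothetical placeholders, so check the scraper's API page for the exact form.

```python
from urllib.parse import urlencode

# Assumed morph.io API base and the scraper path shown at the top of this page.
API_BASE = "https://api.morph.io"
SCRAPER = "andrewbrazzatti/global_research_identifier_database"


def build_api_url(api_key, sql, fmt="json"):
    """Return a data API URL for the scraper (hypothetical helper)."""
    # urlencode escapes the SQL so it is safe to place in a query string.
    params = urlencode({"key": api_key, "query": sql})
    return "{}/{}/data.{}?{}".format(API_BASE, SCRAPER, fmt, params)


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder; "data" is an assumed table name.
    print(build_api_url("YOUR_API_KEY", "select * from data limit 10"))
```

The resulting URL could then be fetched with any HTTP client; the sketch stops at URL construction so that it runs without network access or a real key.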

Showing 10 of 89785 rows

name                                            | email_address | grid_id       | wikipedia_url | established
National University of Benin                    |               | grid.4268.8   |               |
Joint Research Centre                           |               | grid.5368.8   |               | 1961
Pierre and Marie Curie University               |               | grid.5805.8   |               | 1971
University of Auvergne                          |               | grid.7903.d   |               | 1519
Blaise Pascal University                        |               | grid.7907.9   |               | 1854
Institute for Prospective Technological Studies |               | grid.9238.0   |               |
National Institute for Medical Research         |               | grid.16813.3d |               | 1913
Hebrew Rehabilitation Center For Aged           |               | grid.17867.3f |               | 1903
Universite Rennes 2-Haute Bretagne              |               | grid.21395.3c |               |
University of Paris-Sorbonne                    |               | grid.46900.3b |               | 1970
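
Once the SQLite database is downloaded, rows like those in the preview can be queried locally. The following is a minimal sketch, assuming the table is named "data" and that "established" is stored as an integer (neither is confirmed by this page). To stay self-contained it builds an in-memory database from three preview rows; for the real file, replace make_demo_db() with sqlite3.connect() on the downloaded path. The helper names are hypothetical.

```python
import sqlite3

# Three rows copied from the preview; the remaining columns are left NULL.
SAMPLE_ROWS = [
    ("Joint Research Centre", "grid.5368.8", 1961),
    ("University of Auvergne", "grid.7903.d", 1519),
    ("Universite Rennes 2-Haute Bretagne", "grid.21395.3c", None),
]


def make_demo_db():
    """Create an in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (name TEXT, email_address TEXT, grid_id TEXT, "
        "wikipedia_url TEXT, established INTEGER)"
    )
    conn.executemany(
        "INSERT INTO data (name, grid_id, established) VALUES (?, ?, ?)",
        SAMPLE_ROWS,
    )
    return conn


def established_before(conn, year):
    """Return (name, grid_id, established) for institutions founded before year."""
    cur = conn.execute(
        "SELECT name, grid_id, established FROM data "
        "WHERE established < ? ORDER BY established",
        (year,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    for row in established_before(make_demo_db(), 1900):
        print(row)
```

Rows with no "established" value are skipped by the comparison, because SQL treats NULL < year as unknown rather than true.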

Statistics

Average successful run time: 4 minutes

Total run time: 20 minutes

Total CPU time used: 7 minutes

Total disk space used: 8.92 MB

History

  • Manually ran revision 88ddc1fa and completed successfully.
    89506 records added, 74244 records removed in the database
    5 pages scraped
  • Manually ran revision c5f03168 and completed successfully.
    74523 records added, 74523 records removed in the database
  • Manually ran revision 95122b18 and completed successfully.
    74523 records added in the database
  • Manually ran revision 95122b18 and completed successfully.
    nothing changed in the database
  • Manually ran revision 89a2ed95 and completed successfully.
    74524 records added in the database
    74524 records added in the database
  • Created on morph.io
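
The record counts in the most recent history entry can be cross-checked against the row total reported in the Data section. A small sketch, assuming the table held 74523 rows after revision c5f03168 (the history does not state running totals, so this starting figure is an inference from that run's added and removed counts):

```python
# Figures taken from the History and Data sections of this page.
rows_before = 74523  # assumed table size after revision c5f03168
added = 89506        # records added by revision 88ddc1fa
removed = 74244      # records removed by revision 88ddc1fa

rows_after = rows_before + added - removed
print(rows_after)  # 89785, the row total reported in the Data section
```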