okfn / opentrials-clinicaltrailsgov-data

Clinical trial data from clinicaltrials.gov


This is a scraper that runs on Morph. To get started see the documentation.

Documents

Configuration

Environment variables to configure the scraper:

  • DATE_FROM

Get trials whose last-updated date is >= this date.

  • DATE_TO

Get trials whose last-updated date is <= this date.

  • DOWNLOAD_DELAY

Delay, in seconds, between requests to clinicaltrials.gov.
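
As an illustration of how these three variables might be read and validated, here is a minimal Python sketch. The function name `load_config`, the returned dictionary keys, and the 1.0 second fallback for DOWNLOAD_DELAY are assumptions made for this example; they are not taken from the scraper's code.

```python
import os
from datetime import datetime


def load_config(env=os.environ):
    """Read the three scraper settings from an environment-like mapping."""
    # Dates are assumed to be ISO formatted (YYYY-MM-DD), as in the examples
    # given in this README.
    date_from = datetime.strptime(env["DATE_FROM"], "%Y-%m-%d").date()
    date_to = datetime.strptime(env["DATE_TO"], "%Y-%m-%d").date()
    # The 1.0 second fallback is an assumption, not a documented default.
    delay = float(env.get("DOWNLOAD_DELAY", "1.0"))
    if date_from > date_to:
        raise ValueError("DATE_FROM must not be later than DATE_TO")
    if delay < 0:
        raise ValueError("DOWNLOAD_DELAY must not be negative")
    return {"date_from": date_from, "date_to": date_to, "download_delay": delay}
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out without touching the real environment.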

Workflow

We need to download around 200 000 pages and we want to be polite to the source webserver:

  • 1s delay -> 60 hours
  • 0.5s delay -> 30 hours (more than morph.io can do for us)
  • etc

So we can scrape manually, one year at a time, then pull updates for the most recent year automatically:

  • 1 year (60 000 pages) + 1s delay -> 32 hours
  • 1 year (60 000 pages) + 0.5s delay -> 16 hours
  • 1 year (60 000 pages) + 0.25s delay -> 8 hours
  • etc
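
The run time grows with both the number of pages and the delay. The sketch below computes only the time spent waiting between requests, which is a lower bound; actual runs also include network and processing time per page, so the estimates listed above are higher than what this formula gives. The function name `delay_hours` is made up for this example.

```python
def delay_hours(pages, delay_seconds):
    """Hours spent purely waiting between requests (a lower bound)."""
    return pages * delay_seconds / 3600.0


# Page counts are the approximate figures quoted in this README.
for pages in (200_000, 60_000):
    for delay in (1.0, 0.5, 0.25):
        hours = delay_hours(pages, delay)
        print(f"{pages} pages, {delay}s delay: at least {hours:.1f} hours")
```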

Proposed settings for scraping a single year:

DATE_FROM='2010-01-01' DATE_TO='2010-12-31' DOWNLOAD_DELAY='0.25'
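
When scraping several years by hand, the per-year settings can be generated rather than typed. This is a small sketch, not part of the scraper; `yearly_settings` is a hypothetical helper and the default delay of 0.25 simply mirrors the line above.

```python
from datetime import date


def yearly_settings(first_year, last_year, delay="0.25"):
    """Return one settings line per calendar year, first to last inclusive."""
    lines = []
    for year in range(first_year, last_year + 1):
        start = date(year, 1, 1).isoformat()
        end = date(year, 12, 31).isoformat()
        lines.append(
            f"DATE_FROM='{start}' DATE_TO='{end}' DOWNLOAD_DELAY='{delay}'"
        )
    return lines


for line in yearly_settings(2010, 2012):
    print(line)
```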

Proposed settings for updating the database:

DATE_FROM='2015-10-01' DATE_TO='2015-12-31' DOWNLOAD_DELAY='0.5'

Legal notes

Source of all data scraped by this scraper: clinicaltrials.gov.

ClinicalTrials.gov data are available to all requesters, both within and outside the United States, at no charge.

Terms and conditions

Statistics

Average successful run time: about 5 hours

Total run time: 2 months

Total cpu time used: about 20 hours

Total disk space used: 69.8 KB
