CodeForAfrica-SCRAPERS / healthtools_ng

A scraper for the Nigeria HealthTools.


Healthtools Nigeria

This is a suite of healthtools for Code for Nigeria. It contains a scraper that acquires medicine brand names from http://rxnigeria.com/en/items. The scraper currently runs on morph.io.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Installing

Clone the repo from GitHub by running $ git clone git@github.com:CodeForAfrica-SCRAPERS/healthtools_ng.git

Change directory into the package $ cd healthtools_ng

Install the dependencies by running $ pip install -r requirements.txt

You can set the required environment variables like so (no space after the equals sign):
$ export MORPH_AWS_ACCESS_KEY_ID=<aws_access_key_id>
$ export MORPH_AWS_SECRET_KEY=<aws_secret_key>
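As a rough illustration of how a script might pick up these two variables, here is a minimal sketch that reads them from the environment and fails early when one is missing. The helper name `load_aws_settings` and the choice of `RuntimeError` are assumptions made for this example; how scraper.py actually consumes the variables is not shown here.

```python
import os

# The two variable names come from the instructions above.
REQUIRED_VARS = ("MORPH_AWS_ACCESS_KEY_ID", "MORPH_AWS_SECRET_KEY")


def load_aws_settings(environ=None):
    """Return the required settings as a dict, or raise if any is unset.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise in tests. (Hypothetical helper.)
    """
    environ = os.environ if environ is None else environ
    # Treat empty strings the same as unset variables.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in REQUIRED_VARS}
```

Checking for the variables up front gives a clear message at startup instead of a credentials error later in the run.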

You can now run the scrapers $ python scraper.py

Running the tests

Use nosetests to run tests (with stdout) like this: $ nosetests --nocapture

Contributors: DavidLemayian, RyanSept

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
 ! The latest version of Python 2 is python-2.7.14 (you are using python-2.7.9, which is unsupported).
 ! We recommend upgrading by specifying the latest version (python-2.7.14).
   Learn More: https://devcenter.heroku.com/articles/python-runtimes
-----> Installing python-2.7.9
-----> Installing pip
-----> Installing requirements with pip
       DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
       Collecting appdirs==1.4.3
         Downloading appdirs-1.4.3-py2.py3-none-any.whl (12 kB)
       Collecting beautifulsoup4==4.6.0
         Downloading beautifulsoup4-4.6.0-py2-none-any.whl (86 kB)
       Collecting boto3==1.4.4
         Downloading boto3-1.4.4-py2.py3-none-any.whl (127 kB)
       Collecting botocore==1.5.46
         Downloading botocore-1.5.46-py2.py3-none-any.whl (3.5 MB)
       Collecting bs4==0.0.1
         Downloading bs4-0.0.1.tar.gz (1.1 kB)
       Collecting docutils==0.13.1
         Downloading docutils-0.13.1-py2-none-any.whl (537 kB)
       WARNING: The candidate selected for download or install is a yanked version: 'futures' candidate (version 3.1.1 at https://files.pythonhosted.org/packages/a6/1c/72a18c8c7502ee1b38a604a5c5243aa8c2a64f4bba4e6631b1b8972235dd/futures-3.1.1-py2-none-any.whl#sha256=c4884a65654a7c45435063e14ae85280eb1f111d94e542396717ba9828c4337f (from https://pypi.org/simple/futures/))
       Reason for being yanked: Does not declare incompatibility with Python 3
       Collecting futures==3.1.1
         Downloading futures-3.1.1-py2-none-any.whl (14 kB)
       Collecting jmespath==0.9.2
         Downloading jmespath-0.9.2-py2.py3-none-any.whl (23 kB)
       Collecting packaging==16.8
         Downloading packaging-16.8-py2.py3-none-any.whl (23 kB)
       Collecting pyparsing==2.2.0
         Downloading pyparsing-2.2.0-py2.py3-none-any.whl (56 kB)
       Collecting python-dateutil==2.6.0
         Downloading python_dateutil-2.6.0-py2.py3-none-any.whl (194 kB)
       Collecting requests==2.13.0
         Downloading requests-2.13.0-py2.py3-none-any.whl (584 kB)
       Collecting s3transfer==0.1.10
         Downloading s3transfer-0.1.10-py2.py3-none-any.whl (54 kB)
       Collecting six==1.10.0
         Downloading six-1.10.0-py2.py3-none-any.whl (10 kB)
       Building wheels for collected packages: bs4
         Building wheel for bs4 (setup.py): started
         Building wheel for bs4 (setup.py): finished with status 'done'
         Created wheel for bs4: filename=bs4-0.0.1-py2-none-any.whl size=1273 sha256=55c63fbbcf3152a5f323629f3fdad59e45687ea5648f05ba33bc518d8d180cd1
         Stored in directory: /tmp/pip-ephem-wheel-cache-WMVx7j/wheels/98/b9/dc/90f1e36fc6bf9564491a69c9c3d7ae38b8f72986256e416be6
       Successfully built bs4
       Installing collected packages: appdirs, beautifulsoup4, docutils, jmespath, six, python-dateutil, botocore, futures, s3transfer, boto3, bs4, pyparsing, packaging, requests
       Successfully installed appdirs-1.4.3 beautifulsoup4-4.6.0 boto3-1.4.4 botocore-1.5.46 bs4-0.0.1 docutils-0.13.1 futures-3.1.1 jmespath-0.9.2 packaging-16.8 pyparsing-2.2.0 python-dateutil-2.6.0 requests-2.13.0 s3transfer-0.1.10 six-1.10.0
       DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
ERROR: get_total_page_numbers() - url: http://rxnigeria.com/en/items?start={} - err: 'NoneType' object has no attribute 'find'
Running MedBrandNamesScraper
| MedBrandNamesScraper completed.
| 0 entries retrieved.
| 0 pages skipped.
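
The ERROR line above is the message Python produces when a lookup returns None and code then calls `.find` on the result. A minimal sketch of a more defensive approach follows. It assumes, hypothetically, that pagination links on the listing page carry a `start` query parameter (as the URL template in the log suggests); the real page markup is not shown here. The names `StartOffsetParser` and `max_start_offset` are illustrative, the sketch uses the standard library's `html.parser` rather than the beautifulsoup4 package installed above, and it is written for Python 3.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


class StartOffsetParser(HTMLParser):
    """Collect integer `start` query values from <a href="..."> links."""

    def __init__(self):
        super().__init__()
        self.offsets = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return  # anchor without a usable href
        query = parse_qs(urlparse(href).query)
        for value in query.get("start", []):
            if value.isdigit():
                self.offsets.append(int(value))


def max_start_offset(html):
    """Return the largest `start` offset linked from the page.

    Returns None when no such link exists, so the caller can report a
    layout change instead of failing on an attribute lookup.
    """
    parser = StartOffsetParser()
    parser.feed(html)
    return max(parser.offsets) if parser.offsets else None
```

A caller can then treat None as "page layout not recognised" and end the run with a clear message, which would make a result of 0 entries easier to diagnose.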

Statistics

Average successful run time: less than a minute

Total run time: 1 day

Total cpu time used: 38 minutes

Total disk space used: 807 KB

History

  • Auto ran revision d83154d4 and completed successfully.
    nothing changed in the database
  • Auto ran revision d83154d4 and completed successfully.
    nothing changed in the database
  • Auto ran revision d83154d4 and completed successfully.
    nothing changed in the database
  • Auto ran revision d83154d4 and completed successfully.
    nothing changed in the database
  • Auto ran revision d83154d4 and completed successfully.
    nothing changed in the database
  • ...
  • Created on morph.io


Scraper code

Python

healthtools_ng / scraper.py