Work in progress...


system dependencies

sudo apt-get install python-dev

with virtualenv

obtain virtualenv

Check or follow these instructions:

If your Debian release is jessie or newer (virtualenv version 1.9 or later):
sudo apt-get install python-virtualenv
If your Debian release is older than jessie (or your virtualenv version is older than 1.9):
sudo apt-get install ca-certificates gnupg
curl > /pathtovirtualenvdownload/virtualenv-13.1.0.tar.gz # or latest
curl > /pathtovirtualenvdownload/virtualenv-13.1.0.tar.gz.asc # or latest
mkdir /tmp/.gnupg
chmod 700 /tmp/.gnupg
gpg --homedir /tmp/.gnupg --keyserver --recv-keys 3372DCFA
gpg --homedir /tmp/.gnupg --fingerprint 3372DCFA # the fingerprint should be 7C6B 7C5D 5E2B 6356 A926  F04F 6E3C BCE9 3372 DCFA
gpg --homedir /tmp/.gnupg --verify /pathtovirtualenvdownload/virtualenv-13.1.0.tar.gz.asc
tar xzf /pathtovirtualenvdownload/virtualenv-13.1.0.tar.gz --directory /pathtovirtualenvbin/
echo "alias virtualenv='python  /pathtovirtualenvbin/virtualenv-13.1.0/'" >> ~/.bashrc # or other shell start
source ~/.bashrc # or other shell start
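
The 1.9 version threshold mentioned above can be checked with a small shell helper. This is only a minimal sketch: it assumes GNU sort with the -V (version sort) option, which Debian provides, and version_ge is a hypothetical helper name, not part of virtualenv.

```shell
# Hypothetical helper: succeeds if version $1 is equal to or newer than version $2.
# Relies on GNU sort -V, which orders version strings numerically per component.
version_ge() {
    # If $2 sorts first (or both are equal), then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, version_ge "$(virtualenv --version)" 1.9 && echo "packaged virtualenv is recent enough" would pick the first branch. This assumes that virtualenv --version prints a bare version number, which may not hold for every release.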

create a virtualenv

mkdir ~/.virtualenvs
virtualenv ~/.virtualenvs/oiienv
source ~/.virtualenvs/oiienv/bin/activate
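
Before installing anything with pip, it can help to confirm that the virtualenv is really active. The sketch below relies on the VIRTUAL_ENV variable that the activate script exports; in_virtualenv is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: succeeds only when a virtualenv is active.
# The activate script exports VIRTUAL_ENV with the path of the environment.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}
```

A possible use is in_virtualenv && echo "active: $VIRTUAL_ENV" || echo "no virtualenv active", run right after sourcing the activate script.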

install dependencies in virtualenv

git clone
cd page_watcher_scraper
pip install -r requirements.txt


More about pagewatcherscraper/pagewatcherscraper/ TBD

More about pagewatcherscraper/ TBD

If you need local settings, edit pagewatcherscraper/pagewatcherscraper/ and pagewatcherscraper/


To list the scrapers (Scrapy spiders):

scrapy list

To run pagewatcherscraper:

cd page_watcher_scraper

configuring cron job

Create a script like this, replacing /mypath with your own path:

cd page_watcher_scraper

cd /mypath/page_watcher_scraper && source /mypath/page_watcher_scraper/environment && source /mypath/.virtualenvs/oiienv/bin/activate && /mypath/.virtualenvs/oiienv/bin/python /mypath/page_watcher_scraper/ >> /mypath/page_watcher_scraper/cronlog.txt

Edit crontab:

crontab -e

To run the script every day at 14:35:

35 14    * * * /bin/bash /home/duy/page_watcher_scraper/
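
A crontab entry consists of five schedule fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below splits an entry into those parts to make the 14:35 schedule explicit. describe_cron_entry is a hypothetical helper, and /mypath/run.sh is a placeholder script path, not the real name of the script.

```shell
# Hypothetical helper: print the schedule fields and the command of one crontab entry.
describe_cron_entry() {
    # read splits on whitespace; the last variable receives the rest of the line.
    # The here-document passes the entry through without glob-expanding the asterisks.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow"
    printf 'command=%s\n' "$command"
}

# Placeholder script path; substitute the script created above.
describe_cron_entry '35 14    * * * /bin/bash /mypath/run.sh'
```

An asterisk means "every value" for that field, so the entry runs at minute 35 of hour 14 on every day of every month.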

Contributors juga0
