Brandhunt / Brandhunt_produpdate_scraper_phantomjs_py_module_3

Collects product information to be updated from product URLs stored at Brandhunt.se. Only handles products that require a headless browser. Module scraper 3 for the centralized main scraper.


This is a scraper that runs on Morph. To get started see the documentation

Contributors Brandhunt

Last run failed with status code 1.

Console output of last run

Injecting configuration and compiling...  -----> Python app detected  ! Python has released a security update! Please consider upgrading to python-3.6.8  Learn More: https://devcenter.heroku.com/articles/python-runtimes -----> Installing python-3.6.2 -----> Installing pip -----> Installing SQLite3 -----> Installing requirements with pip  Collecting translate==3.5.0 (from -r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/85/b2/2ea329a07bbc0c7227eef84ca89ffd6895e7ec237d6c0b26574d56103e53/translate-3.5.0-py2.py3-none-any.whl  Collecting scraperwiki==0.5.1 (from -r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/30/84/d874847baad89f03e6984fcd87505a37bf924b66519d1e07bf76e2369af0/scraperwiki-0.5.1.tar.gz  Collecting python-slugify>=4.0.0 (from -r /tmp/build/requirements.txt (line 11))  Downloading https://files.pythonhosted.org/packages/c1/35/74ab800f1108b95ff9b8e7672a01dbf1f357159e6d06c1f16e983674ff0c/python_slugify-6.1.2-py2.py3-none-any.whl  Collecting selenium==3.141.0 (from -r /tmp/build/requirements.txt (line 12))  Downloading https://files.pythonhosted.org/packages/80/d6/4294f0b4bce4de0abf13e17190289f9d0613b0a44e5dd6a7f5ca98459853/selenium-3.141.0-py2.py3-none-any.whl (904kB)  Collecting splinter~=0.10.0 (from -r /tmp/build/requirements.txt (line 14))  Downloading https://files.pythonhosted.org/packages/6e/86/fe3b6771846165ce8dc88996aa8e0846d0e6839e7c5a74f4e34ba30e1019/splinter-0.10.0.tar.gz  Collecting lxml~=4.3.3 (from -r /tmp/build/requirements.txt (line 15))  Downloading https://files.pythonhosted.org/packages/9b/d6/966660c441da8e66850f90424c7509a3186a37142c4e1a717d53c9adda96/lxml-4.3.5-cp36-cp36m-manylinux1_x86_64.whl (5.7MB)  Collecting cssselect~=1.0.0 (from -r /tmp/build/requirements.txt (line 16))  Downloading https://files.pythonhosted.org/packages/7b/44/25b7283e50585f0b4156960691d951b05d061abf4a714078393e51929b30/cssselect-1.0.3-py2.py3-none-any.whl  Collecting 
pre-commit (from translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/d6/a0/9c06353771c8dae6db437dd513a885eccdb1566cb332569130484eddf4e7/pre_commit-2.17.0-py2.py3-none-any.whl (195kB)  Collecting requests (from translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/2d/61/08076519c80041bc0ffa1a8af0cbd3bf3e2b62af10435d269a9d0f40564d/requests-2.27.1-py2.py3-none-any.whl (63kB)  Collecting tox (from translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/ac/6a/6f97900d7e04c60d4bed183666d10be180c63a6a9b3765b91481da96d2fe/tox-3.25.0-py2.py3-none-any.whl (85kB)  Collecting click (from translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/4a/a8/0b2ced25639fb20cc1c9784de90a8c25f9504a7f18cd8b5397bd61696d7d/click-8.0.4-py3-none-any.whl (97kB)  Collecting six (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl  Collecting sqlalchemy (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/eb/b6/b8579f5a39712fee884db2bdb9e726437b0cc2f2cb57430613651282f3eb/SQLAlchemy-1.4.39-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6MB)  Collecting alembic (from scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/b3/e2/8d48220731b7279911c43e95cd182961a703b939de6822b00de3ea0d3159/alembic-1.7.7-py3-none-any.whl (210kB)  Collecting text-unidecode>=1.3 (from python-slugify>=4.0.0->-r /tmp/build/requirements.txt (line 11))  Downloading 
https://files.pythonhosted.org/packages/a6/a5/c0b6468d3824fe3fde30dbb5e1f687b291608f9473681bbf7dabbf5a87d7/text_unidecode-1.3-py2.py3-none-any.whl (78kB)  Collecting urllib3 (from selenium==3.141.0->-r /tmp/build/requirements.txt (line 12))  Downloading https://files.pythonhosted.org/packages/ec/03/062e6444ce4baf1eac17a6a0ebfe36bb1ad05e1df0e20b110de59c278498/urllib3-1.26.9-py2.py3-none-any.whl (138kB)  Collecting nodeenv>=0.11.1 (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/54/73/56c89b343befb9c63e8117294d265458f0ff726fa2abcdc6bb5ec5e66a1a/nodeenv-1.6.0-py2.py3-none-any.whl  Collecting importlib-metadata; python_version < "3.8" (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/a0/a1/b153a0a4caf7a7e3f15c2cd56c7702e2cf3d89b1b359d1f1c5e59d68f4ce/importlib_metadata-4.8.3-py3-none-any.whl  Collecting importlib-resources<5.3; python_version < "3.7" (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/11/8e/84a6a778a1160cefcef1192a7bd26e4e6689981553aff13c2b2b6f1c352f/importlib_resources-5.2.3-py3-none-any.whl  Collecting pyyaml>=5.1 (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/b3/85/79b9e5b4e8d3c0ac657f4e8617713cca8408f6cdc65d2ee6554217cedff1/PyYAML-6.0-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (603kB)  Collecting toml (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/44/6f/7120676b6d73228c96e17f1f794d8ab046fc910d781c8d151120c3f1569e/toml-0.10.2-py2.py3-none-any.whl  Collecting identify>=1.0.0 (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading 
https://files.pythonhosted.org/packages/da/1a/93ac674fee1a5af11bdbc1cd895895a8710aa49402558bf91ec3523f0214/identify-2.4.4-py2.py3-none-any.whl (98kB)  Collecting cfgv>=2.0.0 (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/6d/82/0a0ebd35bae9981dea55c06f8e6aaf44a49171ad798795c72c6f64cba4c2/cfgv-3.3.1-py2.py3-none-any.whl  Collecting virtualenv>=20.0.8 (from pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/0a/d7/8f8f84aa834d9afde02055b6d11e0a0f2f35435b2ccf1a1aca4cf9046105/virtualenv-20.15.0-py2.py3-none-any.whl (10.1MB)  Collecting charset-normalizer~=2.0.0; python_version >= "3" (from requests->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/06/b3/24afc8868eba069a7f03650ac750a778862dc34941a4bebeb58706715726/charset_normalizer-2.0.12-py3-none-any.whl  Collecting certifi>=2017.4.17 (from requests->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/e9/06/d3d367b7af6305b16f0d28ae2aaeb86154fa91f144f036c2d5002a5a202b/certifi-2022.6.15-py3-none-any.whl (160kB)  Collecting idna<4,>=2.5; python_version >= "3" (from requests->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/04/a2/d918dcd22354d8958fe113e1a3630137e0fc8b44859ade3063982eacd2a4/idna-3.3-py3-none-any.whl (61kB)  Collecting filelock>=3.0.0 (from tox->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/84/ce/8916d10ef537f3f3b046843255f9799504aa41862bfa87844b9bdc5361cd/filelock-3.4.1-py3-none-any.whl  Collecting py>=1.4.17 (from tox->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading 
https://files.pythonhosted.org/packages/f6/f0/10642828a8dfb741e5f3fbaac830550a518a775c7fff6f04a007259b0548/py-1.11.0-py2.py3-none-any.whl (98kB)  Collecting packaging>=14 (from tox->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/05/8e/8de486cbd03baba4deef4142bd643a3e7bbe954a784dc1bb17142572d127/packaging-21.3-py3-none-any.whl (40kB)  Collecting pluggy>=0.12.0 (from tox->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/9e/01/f38e2ff29715251cf25532b9082a1589ab7e4f571ced434f98d0139336dc/pluggy-1.0.0-py2.py3-none-any.whl  Collecting greenlet!=0.4.17; python_version >= "3" and (platform_machine == "aarch64" or (platform_machine == "ppc64le" or (platform_machine == "x86_64" or (platform_machine == "amd64" or (platform_machine == "AMD64" or (platform_machine == "win32" or platform_machine == "WIN32")))))) (from sqlalchemy->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/76/5a/a6a693096353c1c17932b21ae864a0280e420fadd2f14399a00b085d3d1b/greenlet-1.1.2-cp36-cp36m-manylinux1_x86_64.whl (162kB)  Collecting Mako (from alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/b4/4d/e03d08f16ee10e688bde9016bc80af8b78c7f36a8b37c7194da48f72207e/Mako-1.1.6-py2.py3-none-any.whl (75kB)  Collecting zipp>=0.5 (from importlib-metadata; python_version < "3.8"->pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/bd/df/d4a4974a3e3957fd1c1fa3082366d7fff6e428ddb55f074bf64876f8e8ad/zipp-3.6.0-py3-none-any.whl  Collecting typing-extensions>=3.6.4; python_version < "3.8" (from importlib-metadata; python_version < "3.8"->pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading 
https://files.pythonhosted.org/packages/45/6b/44f7f8f1e110027cf88956b59f2fad776cca7e1704396d043f89effd3a0e/typing_extensions-4.1.1-py3-none-any.whl  Collecting platformdirs<3,>=2 (from virtualenv>=20.0.8->pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/b1/78/dcfd84d3aabd46a9c77260fb47ea5d244806e4daef83aa6fe5d83adb182c/platformdirs-2.4.0-py3-none-any.whl  Collecting distlib<1,>=0.3.1 (from virtualenv>=20.0.8->pre-commit->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/ac/a3/8ee4f54d5f12e16eeeda6b7df3dfdbda24e6cc572c86ff959a4ce110391b/distlib-0.3.4-py2.py3-none-any.whl (461kB)  Collecting pyparsing!=3.0.5,>=2.0.2 (from packaging>=14->tox->translate==3.5.0->-r /tmp/build/requirements.txt (line 9))  Downloading https://files.pythonhosted.org/packages/80/c1/23fd82ad3121656b585351aba6c19761926bb0db2ebed9e4ff09a43a3fcc/pyparsing-3.0.7-py3-none-any.whl (98kB)  Collecting MarkupSafe>=0.9.2 (from Mako->alembic->scraperwiki==0.5.1->-r /tmp/build/requirements.txt (line 10))  Downloading https://files.pythonhosted.org/packages/fc/d6/57f9a97e56447a1e340f8574836d3b636e2c14de304943836bd645fa9c7e/MarkupSafe-2.0.1-cp36-cp36m-manylinux1_x86_64.whl  Installing collected packages: nodeenv, zipp, typing-extensions, importlib-metadata, importlib-resources, pyyaml, toml, identify, cfgv, platformdirs, six, distlib, filelock, virtualenv, pre-commit, lxml, charset-normalizer, certifi, idna, urllib3, requests, py, pyparsing, packaging, pluggy, tox, click, translate, greenlet, sqlalchemy, MarkupSafe, Mako, alembic, scraperwiki, text-unidecode, python-slugify, selenium, splinter, cssselect  Running setup.py install for scraperwiki: started  Running setup.py install for scraperwiki: finished with status 'done'  Running setup.py install for splinter: started  Running setup.py install for splinter: finished with status 'done'  Successfully installed Mako-1.1.6 
MarkupSafe-2.0.1 alembic-1.7.7 certifi-2022.6.15 cfgv-3.3.1 charset-normalizer-2.0.12 click-8.0.4 cssselect-1.0.3 distlib-0.3.4 filelock-3.4.1 greenlet-1.1.2 identify-2.4.4 idna-3.3 importlib-metadata-4.8.3 importlib-resources-5.2.3 lxml-4.3.5 nodeenv-1.6.0 packaging-21.3 platformdirs-2.4.0 pluggy-1.0.0 pre-commit-2.17.0 py-1.11.0 pyparsing-3.0.7 python-slugify-6.1.2 pyyaml-6.0 requests-2.27.1 scraperwiki-0.5.1 selenium-3.141.0 six-1.16.0 splinter-0.10.0 sqlalchemy-1.4.39 text-unidecode-1.3 toml-0.10.2 tox-3.25.0 translate-3.5.0 typing-extensions-4.1.1 urllib3-1.26.9 virtualenv-20.15.0 zipp-3.6.0
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
Traceback (most recent call last):
  File "scraper.py", line 33, in <module>
    exec('mainfunc(' + str(max_prods) + ')', helper.__dict__)
  File "<string>", line 1, in <module>
  File "<string>", line 207, in mainfunc
  File "/app/.heroku/python/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/app/.heroku/python/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/app/.heroku/python/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
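
The run ends with json.loads raising "Expecting value: line 1 column 1 (char 0)". Python reports this when the input string is empty or does not begin with a JSON value (an HTML error page, for example). The log does not show what was being decoded at line 207 of mainfunc, so the following is only a minimal sketch of a defensive decode. The helper name parse_json_or_none is hypothetical and not part of the scraper.

```python
import json


def parse_json_or_none(text):
    """Return the decoded JSON value, or None when `text` is not valid JSON.

    Hypothetical helper: the scraper's own code at line 207 is not shown in the log.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Print a short preview so an empty body or an HTML page is easy to spot.
        print("Could not decode JSON at char %d: %r" % (err.pos, text[:80]))
        return None


# An empty string reproduces the error message seen in the traceback.
try:
    json.loads("")
except json.JSONDecodeError as err:
    reproduced = str(err)

print(reproduced)
print(parse_json_or_none(""))
print(parse_json_or_none('{"ok": true}'))
```

Returning None lets the caller skip or retry the affected product instead of aborting the whole run. Whether that is the right policy for this scraper is an assumption.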

Data

Downloaded 60292 times by Brandhunt



Showing 10 of 967 rows

productid url domain price salesprice domainmisc prodlogurls prodlogurl finalimgurls validimgurls imgurls notfound notavailable removeon404 soldoutfix soldouthtmlfix catstoaddresult attributes sizetypemapsqls
28074 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 141, "name": "Male", "slug": "male", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
28310 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 141, "name": "Male", "slug": "male", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
28736 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 142, "name": "Female", "slug": "female", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
28892 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 142, "name": "Female", "slug": "female", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
29166 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 142, "name": "Female", "slug": "female", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
29170 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 142, "name": "Female", "slug": "female", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
30774 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 142, "name": "Female", "slug": "female", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
30820 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 142, "name": "Female", "slug": "female", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
27994 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 141, "name": "Male", "slug": "male", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
28002 pauw.com 1 "" "" "" "" "" true false true false false "" [{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", "slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], "position": 1, "visible": 1, "variation": 1}, {"name": "Sex", "options": [[{"term_id": 141, "name": "Male", "slug": "male", "taxonomy": "pa_sex"}, false]], "position": 2, "visible": 1, "variation": 1}] ["", "", "", ""]
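
Each row stores its product attributes as a JSON string, so the brand and sex terms can be read out with the standard json module. The sketch below uses the attributes value copied from the first row (productid 28074). The function name attribute_terms is hypothetical, and the meaning of the boolean paired with each term is not documented on this page.

```python
import json

# Attributes cell copied from the first row of the table (productid 28074).
attributes_cell = (
    '[{"name": "Brand", "options": [[{"term_id": 918, "name": "Pauw", '
    '"slug": "brand-pauw", "taxonomy": "pa_brand"}, false]], '
    '"position": 1, "visible": 1, "variation": 1}, '
    '{"name": "Sex", "options": [[{"term_id": 141, "name": "Male", '
    '"slug": "male", "taxonomy": "pa_sex"}, false]], '
    '"position": 2, "visible": 1, "variation": 1}]'
)


def attribute_terms(cell):
    """Map each attribute name to the list of term names in its options."""
    result = {}
    for attribute in json.loads(cell):
        # Each option is a [term, flag] pair; the flag's meaning is undocumented here.
        result[attribute["name"]] = [term["name"] for term, _flag in attribute["options"]]
    return result


print(attribute_terms(attributes_cell))  # {'Brand': ['Pauw'], 'Sex': ['Male']}
```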

Statistics

Average successful run time: about 2 hours

Total run time: about 1 month

Total cpu time used: 4 days

Total disk space used: 3.12 MB

History

  • Auto ran revision 8077995e and failed.
    nothing changed in the database
  • Auto ran revision 8077995e and failed.
    nothing changed in the database
  • Auto ran revision 8077995e and failed.
    nothing changed in the database
  • Auto ran revision 8077995e and failed.
    nothing changed in the database
  • Auto ran revision 8077995e and failed.
    nothing changed in the database
  • ...
  • Created on morph.io
