balling / HongKongCharities

The list of charitable institutions and trusts of a public character, which are exempt from tax under section 88 of the Inland Revenue Ordinance


This scraper runs periodically on morph.io. Visit its morph.io page to download the data set.

Source

The current complete list of charitable institutions and trusts of a public character, which are exempt from tax under section 88 of the Inland Revenue Ordinance, is available on the IRD website in two forms:

  • search page
      - no unique identifier for each charity for change tracking
      - no subsidiary information
  • pdf
      - no unique identifier for each charity for change tracking
      - not machine readable (pdftotxt.py attempts to convert the pdf to txt, but the pdf's formatting is too irregular to extract structured information reliably)
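The pdf route can be sketched roughly as follows. This is a simplified illustration of the kind of heuristic pdftotxt.py attempts after converting the pdf to text (e.g. with `pdftotext`); the one-charity-per-line format and the dd.mm.yyyy date are assumptions for illustration, and the real layout is far messier, which is exactly why the approach is unreliable.

```python
import re

# Assumed (simplified) line shape: an English name followed by an effective
# date such as "HONG KONG STUDENT AID SOCIETY 13.10.1981". The real pdf wraps
# names across lines and interleaves columns, so this regex often fails.
LINE_RE = re.compile(r"^(?P<name_en>.+?)\s+(?P<effective_date>\d{2}\.\d{2}\.\d{4})\s*$")

def parse_line(line):
    """Return {'name_en', 'effective_date'} for a well-formed line, else None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None  # irregular formatting: wrapped names, spilled columns, etc.
    return m.groupdict()

print(parse_line("HONG KONG STUDENT AID SOCIETY 13.10.1981"))
# -> {'name_en': 'HONG KONG STUDENT AID SOCIETY', 'effective_date': '13.10.1981'}
```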

Todos

  • [ ] obtain identifier for each subsidiary (e.g. S000002 for Ricci Hall (H.K.U.))
  • [ ] add last update time for subsidiaries
  • [ ] add status (active/inactive) for each charity based on last update time
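The last todo could work roughly like this: treat a charity whose record has not been updated recently as removed from the register. The two-year cutoff and the function name are illustrative assumptions, not anything the IRD publishes.

```python
from datetime import date, timedelta

def charity_status(last_update, today=None, max_age_days=730):
    """Classify a charity as 'active' or 'inactive' from its last_update date.

    The 730-day cutoff is an arbitrary illustration: a record untouched for
    roughly two published lists has likely been dropped from the register.
    """
    today = today or date.today()
    age = today - last_update
    return "active" if age <= timedelta(days=max_age_days) else "inactive"

print(charity_status(date(2016, 4, 30), today=date(2022, 5, 18)))  # inactive
```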

Contributors: balling

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
-----> Python app detected
-----> Installing python-3.6.2
-----> Installing pip
-----> Installing requirements with pip
       Collecting Scrapy>=1.5
       Collecting scraperwiki>=0.5.1
       ...
       Building wheels for collected packages: scraperwiki, PyDispatcher
       Successfully built scraperwiki PyDispatcher
       Successfully installed Automat-20.2.0 Mako-1.1.6 MarkupSafe-2.0.1 PyDispatcher-2.0.5 Scrapy-2.6.1 Twisted-21.2.0 alembic-1.7.7 attrs-21.4.0 certifi-2021.10.8 cffi-1.15.0 charset-normalizer-2.0.12 constantly-15.1.0 cryptography-37.0.2 cssselect-1.1.0 filelock-3.4.1 greenlet-1.1.2 hyperlink-21.0.0 idna-3.3 importlib-metadata-4.8.3 importlib-resources-5.4.0 incremental-21.3.0 itemadapter-0.6.0 itemloaders-1.0.4 jmespath-0.10.0 lxml-4.8.0 parsel-1.6.0 protego-0.2.1 pyOpenSSL-22.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycparser-2.21 queuelib-1.6.2 requests-2.27.1 requests-file-1.5.1 scraperwiki-0.5.1 service-identity-21.1.0 six-1.16.0 sqlalchemy-1.4.36 tldextract-3.1.2 typing-extensions-4.1.1 urllib3-1.26.9 w3lib-1.22.0 zipp-3.6.0 zope.interface-5.4.0
-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
/app/.heroku/python/lib/python3.6/site-packages/OpenSSL/crypto.py:8: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography import utils, x509
2022-05-18 11:21:35 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: scrapybot)
2022-05-18 11:21:35 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.12, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.2.0, Python 3.6.2 (default, Jul 31 2017, 22:58:30) - [GCC 4.8.4], pyOpenSSL 22.0.0 (OpenSSL 3.0.3 3 May 2022), cryptography 37.0.2, Platform Linux-5.12.2-x86_64-linode144-x86_64-with-debian-jessie-sid
2022-05-18 11:21:35 [scrapy.crawler] INFO: Overridden settings: {'DOWNLOAD_DELAY': 0.2, 'LOG_LEVEL': 'INFO'}
2022-05-18 11:21:35 [scrapy.core.engine] INFO: Spider opened
2022-05-18 11:21:36 [py.warnings] WARNING: /app/.heroku/python/lib/python3.6/site-packages/scraperwiki/sql.py:75: SAWarning: SQLite version (3, 7, 9) is older than 3.7.16, and will not support right nested joins, as are sometimes used in more complex ORM scenarios.
2022-05-18 11:22:36 [scrapy.extensions.logstats] INFO: Crawled 237 pages (at 237 pages/min), scraped 0 items (at 0 items/min)
...
2022-05-18 12:27:36 [scrapy.extensions.logstats] INFO: Crawled 15921 pages (at 242 pages/min), scraped 0 items (at 0 items/min)
2022-05-18 12:27:55 [scrapy.core.engine] INFO: Closing spider (finished)
2022-05-18 12:27:55 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 9019977,
 'downloader/request_count': 16267,
 'downloader/request_method_count/POST': 16267,
 'downloader/response_bytes': 31214294,
 'downloader/response_count': 16267,
 'downloader/response_status_count/200': 15999,
 'downloader/response_status_count/502': 268,
 'elapsed_time_seconds': 3979.460375,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 5, 18, 12, 27, 55, 486504),
 'log_count/INFO': 76,
 'log_count/WARNING': 1,
 'memusage/max': 77799424,
 'memusage/startup': 68546560,
 'response_received_count': 15999,
 'retry/count': 268,
 'retry/reason_count/502 Bad Gateway': 268,
 'scheduler/dequeued': 16267,
 'scheduler/dequeued/memory': 16267,
 'scheduler/enqueued': 16267,
 'scheduler/enqueued/memory': 16267,
 'start_time': datetime.datetime(2022, 5, 18, 11, 21, 36, 26129)}
2022-05-18 12:27:55 [scrapy.core.engine] INFO: Spider closed (finished)
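The overridden settings in the log, together with the 268 retried 502 Bad Gateway responses, correspond to a Scrapy configuration roughly like the sketch below. The retry values shown are Scrapy's documented defaults, which the log implies were left untouched; only the first two keys are confirmed by the log itself.

```python
# Scrapy settings matching what the run log reports.
SETTINGS = {
    # 0.2 s between requests caps the crawl near 300 pages/min; the log shows
    # ~240 pages/min once server response time is included.
    "DOWNLOAD_DELAY": 0.2,
    "LOG_LEVEL": "INFO",
    # Scrapy defaults, shown explicitly: the stock RetryMiddleware is what
    # re-issued the 268 requests that came back as 502 Bad Gateway
    # (RETRY_HTTP_CODES already includes 502).
    "RETRY_ENABLED": True,
    "RETRY_TIMES": 2,
}
```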

Data

Downloaded 8 times by balling, hkkenneth, ScottBGI and edwardfung123


Download table (as CSV) Download SQLite database (3.37 MB) Use the API

rows 10 / 9807

name_en | name_ch | alias_en | alias_ch | effective_date | uid | last_update
ROYAL SOCIETY OF ST. GEORGE, HONG KONG BRANCH, THE | | | | 1954-02-09 | 91/00075 | 2016-04-30
EVANGELICAL FREE CHURCH OF AMERICA | | | | 1954-05-12 | 91/00099 | 2016-04-30
HONG KONG STUDENT AID SOCIETY | 香港學生輔助會 | | | 1981-10-13 | 91/00305 | 2016-04-30
TING WAI MONASTERY LIMITED | 定慧寺有限公司 | | | 1969-04-24 | 91/00670 | 2016-04-30
INCORPORATED TRUSTEES OF THE WAH KIU YAT PO FUND FOR THE RELIEF OF UNDERPRIVILEGED CHILDREN, THE | 華僑日報救童助學運動基金 | | | 1972-01-27 | 91/00777 | 2016-04-30
BRADBURY CHARITABLE TRUST FUND | | | | 1971-03-24 | 91/00787 | 2016-04-30
WAH YAN DRAMATIC SOCIETY LIMITED | 華仁戲劇社有限公司 | | | 1972-12-18 | 91/00906 | 2016-04-30
SAN TIN RURAL COMMITTEE | 新田鄉鄉事委員會 | | | 1973-02-27 | 91/00913 | 2016-04-30
VINCENTIAN FATHERS PROCURATION (ECONOMAT-GENERAL DE LA CONGREGATION DE LA MISSION) | | | | 1974-03-11 | 91/00983 | 2016-04-30
SPECIAL FUND, THE | | | | 1974-07-05 | 91/01040 | 2016-04-30
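Rather than paging through the preview table, the whole data set can be queried through morph.io's API, which runs arbitrary SQL against the scraper's SQLite database. A minimal sketch; the table name `swdata` is scraperwiki's default and is an assumption here, as is the column list, and you need your own API key from your morph.io account.

```python
from urllib.parse import urlencode

def morph_api_url(api_key, query, fmt="json"):
    """Build a morph.io API URL for this scraper's data.

    morph.io exposes each scraper's database at
    https://api.morph.io/<owner>/<scraper>/data.<format>?key=...&query=<SQL>
    """
    base = "https://api.morph.io/balling/HongKongCharities/data." + fmt
    return base + "?" + urlencode({"key": api_key, "query": query})

# Fetch the first ten charities (pass this URL to requests.get or curl).
url = morph_api_url("YOUR_API_KEY", "SELECT uid, name_en FROM swdata LIMIT 10")
print(url)
```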


rows 10 / 8983

name_en | name_ch | alias_en | alias_ch | effective_date | uid | last_update
SOCIETY OF JESUS IN HONG KONG, THE | 香港耶蘇會 | The Procurator in Hong Kong of The English Assistancy of The Jesuit Order | | 1949-03-11 | 91/00001 | 2018-07-31
DUPPUY FUND, THE | | | | 1950-06-23 | 91/00003 | 2018-07-31
CHATER MASONIC SCHOLARSHIP FUND TRUST | | | | 1949-11-11 | 91/00005 | 2018-07-31
MORRISON SCHOLARSHIP FUND | | | | 1955-09-12 | 91/00006 | 2018-07-31
HONG KONG AND FAR EAST MASONIC BENEVOLENCE FUND CORPORATION | | | | 1949-06-01 | 91/00008 | 2018-07-31
SHU PUN CHARITABLE ASSOCIATION LIMITED, THE | 香港樹本善堂有限公司 | | | 1949-06-14 | 91/00009 | 2018-07-31
HONG KONG UNIVERSITY STUDENTS' UNION, THE | 香港大學學生會 | | | 1949-09-13 | 91/00011 | 2018-07-31
UNION CHURCH | | | | 1949-09-13 | 91/00014 | 2018-07-31
ST. JOHN'S CATHEDRAL | | St. John's Cathedral Endowment Fund | | 1949-09-13 | 91/00015 | 2018-07-31
SECRETARY FOR HOME AFFAIRS INCORPORATED (EX-CHINESE PUBLIC DISPENSARIES FUND) | | | | 1949-11-05 | 91/00018 | 2018-07-31

Statistics

Average successful run time: about 1 hour

Total run time: 3 months

Total cpu time used: 5 days

Total disk space used: 3.4 MB

History

  • Auto ran revision 01ea5fa5 and completed successfully.
    nothing changed in the database
  • Auto ran revision 01ea5fa5 and completed successfully.
    nothing changed in the database
  • Auto ran revision 01ea5fa5 and completed successfully.
    nothing changed in the database
  • Auto ran revision 01ea5fa5 and completed successfully.
    nothing changed in the database
  • Auto ran revision 01ea5fa5 and completed successfully.
    nothing changed in the database
  • ...
  • Created on morph.io


Scraper code

Python

HongKongCharities / scraper.py
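The run log shows the spider drives the IRD search page with paginated POST requests (16,267 of them at roughly 240 pages/min). As a rough, standard-library illustration of the shape of one such request only; the endpoint URL and form field names below are placeholders, not the real IRD parameters or the actual scraper.py code.

```python
import urllib.request
from urllib.parse import urlencode

# Placeholder endpoint, NOT the real IRD search URL.
SEARCH_URL = "https://example.invalid/ird/charity-search"

def build_search_request(page):
    """Build one paginated search POST (field names are hypothetical)."""
    form = urlencode({"page": page, "lang": "en"}).encode("ascii")
    # Supplying data= makes urllib issue a POST, matching the log's
    # request_method_count/POST entries.
    return urllib.request.Request(SEARCH_URL, data=form)

req = build_search_request(1)
print(req.get_method())  # POST
```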