balling / HongKongCharities

The list of charitable institutions and trusts of a public character, which are exempt from tax under section 88 of the Inland Revenue Ordinance


This is a scraper that runs on Morph. To get started, see the documentation.

Contributors balling

Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
Injecting scraper and running...
2016-06-01 17:35:50 [scrapy] INFO: Scrapy 1.1.0 started (bot: scrapybot)
2016-06-01 17:35:50 [scrapy] INFO: Overridden settings: {'LOG_LEVEL': 'INFO'}
2016-06-01 17:35:50 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole']
2016-06-01 17:35:50 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-06-01 17:35:50 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-06-01 17:35:50 [scrapy] INFO: Enabled item pipelines: []
2016-06-01 17:35:50 [scrapy] INFO: Spider opened
2016-06-01 17:35:50 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:36:50 [scrapy] INFO: Crawled 527 pages (at 527 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:37:50 [scrapy] INFO: Crawled 1636 pages (at 1109 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:38:50 [scrapy] INFO: Crawled 2957 pages (at 1321 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:39:50 [scrapy] INFO: Crawled 4330 pages (at 1373 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:40:50 [scrapy] INFO: Crawled 5297 pages (at 967 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:41:50 [scrapy] INFO: Crawled 6651 pages (at 1354 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:42:50 [scrapy] INFO: Crawled 7984 pages (at 1333 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:43:50 [scrapy] INFO: Crawled 9307 pages (at 1323 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:44:50 [scrapy] INFO: Crawled 10608 pages (at 1301 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:45:50 [scrapy] INFO: Crawled 11607 pages (at 999 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:46:50 [scrapy] INFO: Crawled 12588 pages (at 981 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:47:50 [scrapy] INFO: Crawled 13966 pages (at 1378 pages/min), scraped 0 items (at 0 items/min)
2016-06-01 17:48:34 [scrapy] INFO: Closing spider (finished)
2016-06-01 17:48:34 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 6850782,
 'downloader/request_count': 14999,
 'downloader/request_method_count/GET': 14999,
 'downloader/response_bytes': 17948574,
 'downloader/response_count': 14999,
 'downloader/response_status_count/200': 14999,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 6, 1, 17, 48, 34, 652139),
 'log_count/INFO': 19,
 'response_received_count': 14999,
 'scheduler/dequeued': 14999,
 'scheduler/dequeued/memory': 14999,
 'scheduler/enqueued': 14999,
 'scheduler/enqueued/memory': 14999,
 'start_time': datetime.datetime(2016, 6, 1, 17, 35, 50, 256411)}
2016-06-01 17:48:34 [scrapy] INFO: Spider closed (finished)
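
The stats dump at the end of the console output carries enough information to work out how long the crawl took and its average throughput. The following is a minimal sketch of that arithmetic, using only the `start_time`, `finish_time` and `downloader/response_count` values copied from the log; the variable names are illustrative and not part of the scraper.

```python
from datetime import datetime

# Values copied from the "Dumping Scrapy stats" entry in the console output.
start_time = datetime(2016, 6, 1, 17, 35, 50, 256411)
finish_time = datetime(2016, 6, 1, 17, 48, 34, 652139)
response_count = 14999  # 'downloader/response_count'

# Wall-clock duration of the crawl as a timedelta.
elapsed = finish_time - start_time

# Average throughput over the whole run, in pages per minute.
pages_per_minute = response_count / (elapsed.total_seconds() / 60)

print(f"elapsed: {elapsed}")
print(f"average throughput: {pages_per_minute:.0f} pages/min")
```

The elapsed time comes to about 12 minutes 44 seconds, which is in line with the 13-minute average run time reported under Statistics.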

Data

Downloaded 1 time by balling


Download table (as CSV) Download SQLite database (1.18 MB) Use the API

Showing 10 of 8817 rows

alias_ch | effective_date | name_en | last_update | name_ch | uid | alias_en
--- | --- | --- | --- | --- | --- | ---
 | 1950-06-23 | DUPPUY FUND, THE | 2016-04-30 | | 91/00003 |
 | 1949-06-01 | HONG KONG AND FAR EAST MASONIC BENEVOLENCE FUND CORPORATION | 2016-04-30 | | 91/00008 |
 | 1949-03-11 | SOCIETY OF JESUS IN HONG KONG, THE | 2016-04-30 | 香港耶蘇會 | 91/00001 | The Procurator in Hong Kong of The English Assistancy of The Jesuit Order
 | 1955-09-12 | MORRISON SCHOLARSHIP FUND | 2016-04-30 | | 91/00006 |
 | 1949-11-11 | CHATER MASONIC SCHOLARSHIP FUND TRUST | 2016-04-30 | | 91/00005 |
 | 1949-06-14 | SHU PUN CHARITABLE ASSOCIATION LIMITED, THE | 2016-04-30 | 香港樹本善堂有限公司 | 91/00009 |
 | 1949-09-13 | HONG KONG UNIVERSITY STUDENTS' UNION | 2016-04-30 | 香港大學學生會 | 91/00011 |
 | 1949-09-13 | UNION CHURCH | 2016-04-30 | | 91/00014 |
 | 1949-09-13 | ST. JOHN'S CATHEDRAL | 2016-04-30 | | 91/00015 | St. John's Cathedral Endowment Fund
 | 1950-02-23 | HONG KONG - MACAO CONFERENCE OF SEVENTH-DAY ADVENTISTS | 2016-04-30 | 基督復臨安息日會港澳區會 | 91/00021 |
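
Once the SQLite database is downloaded, the columns listed above can be queried directly. The sketch below is an illustration only: it builds an in-memory table with the same seven columns and three of the preview rows, then selects the charities that have a Chinese name. The table name `data` and the use of NULL for empty cells are assumptions, not confirmed by this page; check the real schema before relying on them.

```python
import sqlite3

# In-memory stand-in for the downloaded database.
# ASSUMPTION: the table is called "data" and empty cells are stored as NULL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data (
        alias_ch TEXT, effective_date TEXT, name_en TEXT,
        last_update TEXT, name_ch TEXT, uid TEXT, alias_en TEXT
    )
    """
)

# Three rows taken from the data preview on this page.
rows = [
    (None, "1950-06-23", "DUPPUY FUND, THE", "2016-04-30",
     None, "91/00003", None),
    (None, "1949-03-11", "SOCIETY OF JESUS IN HONG KONG, THE", "2016-04-30",
     "香港耶蘇會", "91/00001",
     "The Procurator in Hong Kong of The English Assistancy of The Jesuit Order"),
    (None, "1949-09-13", "HONG KONG UNIVERSITY STUDENTS' UNION", "2016-04-30",
     "香港大學學生會", "91/00011", None),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Charities with a Chinese name, oldest exemption first.
# The check covers both NULL and empty-string representations of "no value".
query = """
    SELECT uid, name_en, name_ch
    FROM data
    WHERE name_ch IS NOT NULL AND name_ch != ''
    ORDER BY effective_date
"""
with_chinese_name = conn.execute(query).fetchall()

for uid, name_en, name_ch in with_chinese_name:
    print(uid, name_en, name_ch)

conn.close()
```

To run the same query against the real download, replace `":memory:"` with the path to the downloaded file and drop the `CREATE TABLE` and `INSERT` statements.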

Statistics

Average successful run time: 13 minutes

Total run time: 17 minutes

Total CPU time used: 3 minutes

Total disk space used: 1.21 MB

History

  • Manually ran revision 8b08a486 and completed successfully.
    8817 records added, 229 records removed in the database
  • Manually ran revision 513aa642 and failed.
    229 records added in the database
  • Created on morph.io

Scraper code

Python

HongKongCharities / scraper.py