profibadan / PPP_Scraper

ScraperWiki implementation that extracts data from the monthly PPP PDF documents. For use on Morph.io.

Scrapes www.un.org

الأمم المتحدة 联合国 United Nations Nations Unies Организация Объединенных Наций Las Naciones Unidas


Last run completed successfully.

Console output of last run

Injecting configuration and compiling...
Injecting scraper and running...
The pdf file has 157757 bytes
After converting to xml it has 472798 bytes
The first 200 characters are:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE pdf2xml SYSTEM "pdf2xml.dtd">
<pdf2xml producer="poppler" version="0.24.5">
<page number="1" position="absolute" top="0" left="0" height="1188" width=
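The console output indicates that the PDF is converted to poppler's pdf2xml format before any data is extracted. The following is a minimal sketch of how positioned text in that format could be grouped into rows. It assumes `<page>` elements containing `<text>` children with `top` and `left` attributes; the function name `text_rows` and the sample XML are hypothetical and are not taken from the scraper's code.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def text_rows(xml_string):
    """Group pdf2xml <text> elements into rows, keyed by (page number, top)."""
    root = ET.fromstring(xml_string)
    rows = defaultdict(list)
    for page in root.iter("page"):
        page_no = int(page.get("number"))
        for el in page.iter("text"):
            # itertext() also collects text nested in <b> or <i> children.
            content = "".join(el.itertext()).strip()
            if content:
                key = (page_no, int(el.get("top")))
                rows[key].append((int(el.get("left")), content))
    # Order rows top to bottom, and cells within a row left to right.
    return [[text for _, text in sorted(cells)]
            for _, cells in sorted(rows.items())]


# Hypothetical sample, much simpler than real pdf2xml output.
SAMPLE = """<pdf2xml producer="poppler" version="0.24.5">
<page number="1">
<text top="100" left="300" width="50" height="12" font="0">UNMISS</text>
<text top="100" left="50" width="60" height="12" font="0">Albania</text>
<text top="120" left="50" width="60" height="12" font="0"><b>Algeria</b></text>
</page>
</pdf2xml>"""

for row in text_rows(SAMPLE):
    print(row)
```

Grouping on an exact `top` value is the simplest choice; real documents may need a small tolerance, since text on the same visual line can differ by a pixel or two.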

Data

Downloaded 3 times by profibadan

Showing 10 of 957 rows

T    tcc        tccIso3Num  type                date      mission   F   tccIso3Alpha  M    dateString
3    Albania    008         Individual Police   20141130  UNMISS    0   ALB           3    11/30/14
5    Algeria    012         Experts on Mission  20141130  MONUSCO   0   DZA           5    11/30/14
1    Argentina  032         Individual Police   20141130  MINURSO   1   ARG           0    11/30/14
3    Argentina  032         Experts on Mission  20141130  MINURSO   0   ARG           3    11/30/14
19   Argentina  032         Individual Police   20141130  MINUSTAH  1   ARG           18   11/30/14
567  Argentina  032         Contingent Troop    20141130  MINUSTAH  38  ARG           529  11/30/14
268  Argentina  032         Contingent Troop    20141130  UNFICYP   16  ARG           252  11/30/14
6    Argentina  032         Individual Police   20141130  UNMIL     2   ARG           4    11/30/14
4    Argentina  032         Individual Police   20141130  UNMISS    0   ARG           4    11/30/14
3    Argentina  032         Individual Police   20141130  UNOCI     0   ARG           3    11/30/14
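In the ten rows shown, T equals F + M, and the date and dateString columns encode the same day in two formats. The sketch below checks both relationships for a single row. The helper name `check_row` is hypothetical, and the reading of the pattern as a general rule for all 957 rows is an assumption drawn only from this preview.

```python
from datetime import datetime


def check_row(row):
    """Return True if T == F + M and date matches dateString."""
    # date looks like YYYYMMDD; dateString looks like MM/DD/YY.
    compact = datetime.strptime(row["date"], "%Y%m%d").date()
    slashed = datetime.strptime(row["dateString"], "%m/%d/%y").date()
    total_ok = int(row["T"]) == int(row["F"]) + int(row["M"])
    return compact == slashed and total_ok


# One row copied from the preview above.
row = {
    "T": "567", "tcc": "Argentina", "tccIso3Num": "032",
    "type": "Contingent Troop", "date": "20141130", "mission": "MINUSTAH",
    "F": "38", "tccIso3Alpha": "ARG", "M": "529", "dateString": "11/30/14",
}
print(check_row(row))
```

Comparing parsed dates rather than raw strings avoids false mismatches if some dateString values are not zero-padded.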

Statistics

Average successful run time: half a minute

Total run time: less than a minute

Total cpu time used: less than 10 seconds

Total disk space used: 172 KB

History

  • Manually ran revision 32bb14ee and completed successfully.
    957 records added, 957 records removed in the database
    2 pages scraped
  • Manually ran revision 32bb14ee and completed successfully.
    957 records added in the database
    2 pages scraped
  • Created on morph.io

Scraper code

PPP_Scraper