In 2015 the Open Data Inventory (ODIN) assessed the coverage and openness of official statistics in 125 mostly low- and middle-income countries. Data in 20 statistical categories were assessed on 10 elements of coverage and openness. The assessments are objective: they record whether data are available and whether the data conform to standards for open data, but they do not attempt to assess the quality of the data. They also record the online location of the data, allowing others to verify the results.
ODIN scores are summarized by data categories and by the elements of data coverage and openness, creating a profile of each country’s statistical system and its ability to deliver the information needed by governments, citizens, and the private sector to guide their decisions. In 2015 no country’s ODIN score reached 70 percent of the total possible points. The highest scoring country was Mexico, with a score of 68 percent, followed closely by Moldova and Mongolia. Rwanda, with a score of 59 percent, was 4th overall and the highest scoring country in Africa. The lowest scoring countries were found in parts of Africa, Asia, and Europe. Measured just on the elements of openness, Mexico was the clear leader with a score of 74 percent, followed by Rwanda and Moldova. Measured by data coverage, which considers the availability of key indicators over the last 10 years and for sub-national units, Cuba had the highest score, followed by China and Moldova.
There is more to be learned from the ODIN assessments. This first annual report on the Open Data Inventory describes the assessment process and highlights significant patterns in the results. The appendixes list results for 125 countries and provide greater details on the assessment methodology as well as orientation for obtaining ODIN results online at http://odin.opendatawatch.com/.
What is ODIN?
The Open Data Inventory (ODIN) is an assessment of the coverage and openness of data provided on the websites maintained by national statistical offices (NSOs). Each assessment covers twenty categories of social, economic, and environmental statistics. Each data category is assessed on five elements of coverage and five elements of openness. The twenty categories assessed by ten elements result in two hundred item scores for each country. Aggregate scores are computed across the categories and elements and for sub-groups of categories and elements. The overall ODIN score is an indicator of how complete and open an NSO’s data offerings are. The categorical scores for social, economic, and environmental statistics and summary scores for coverage and openness produce a picture of the national statistical systems’ strengths and weaknesses.
In its first year of operation, ODIN contains complete assessments of 125 countries. More will be added with each annual update with the goal of including all recognized national statistical systems.
What is its purpose?
By providing a comprehensive view of the coverage and openness of official statistics, ODIN will help to identify critical gaps, promote open data policies, improve data access, and encourage dialogue between NSOs and data users. NSOs and their development partners can use ODIN as part of a strategic planning process and as a measuring rod for the development of the statistical system. ODIN also provides valuable information to data users within the government and private sectors and to the general public about the availability of important statistical series. In addition to the ratings of coverage and openness in twenty statistical categories, ODIN assessments record the online location of key indicators in each data category, permitting quick access to hundreds of indicators.
Why assess national statistical offices?
ODIN assessments begin with the websites maintained by national statistical offices because, in most countries, the NSO is the lead agency of the national statistical system, coordinating its work with other governmental bodies that produce official statistics. Of course, the responsibilities of NSOs differ from country to country. It is not uncommon for important statistics to be produced by autonomous agencies such as the central bank or ministries of education, health, or planning, which may not report directly to the NSO or the chief statistician.
If a national data source can be reached from the NSO’s website, it is included in the ODIN assessment. However, external data portals, such as the African Development Bank’s Open Data for Africa (http://www.afdb.org/en/knowledge/statistics/open-data-for-africa/), the World Bank Data website (http://data.worldbank.org/), or the many databases maintained by other international agencies (see http://data.un.org/) are not included in the assessment.
NSOs, as producers and caretakers of official statistics, have a special obligation to maximize their public benefit. NSOs can and should be the leading advocate for and provider of high quality, official statistics to government, the public, and to the international community. Indeed, an NSO that subscribes to the United Nations’ Fundamental Principles of Official Statistics is committed by Principle 1 to provide “… official statistics that meet the test of practical utility … compiled and made available on an impartial basis by official statistical agencies to honor citizens’ entitlement to public information.”
How are Open Data defined?
There is general agreement on the core meaning of open data. As summarized in the Open Definition version 2.1 (http://opendefinition.org/od/2.1/en/) “Knowledge is open if anyone is free to access, use, modify, and share it — subject, at most, to measures that preserve provenance and openness.” This concept has been elaborated by governmental and non-governmental organizations. The International Open Data Charter (http://opendatacharter.net/) provides one of the most detailed explanations.
In practical terms open data should be machine readable in non-proprietary formats, selectable by users, accompanied by accurate metadata, and free to be used and reused for any purpose without limitations other than acknowledgement of the original source. These core elements have been incorporated into the ODIN assessment. A more detailed description of the elements on which openness is scored is available in the methodological appendix to this report.
What data categories are included in ODIN?
ODIN assessments review published statistics in twenty categories, grouped as social statistics (eight categories), economic and financial statistics (seven categories), and environmental statistics (five categories). Although each group contains a different number of data categories, the ODIN overall scores weight the three groups equally. For each category, a small set of principal or sentinel indicators has been identified. These were selected because they are frequently needed for developing and implementing public policies or private initiatives and because they provide evidence of underlying statistical processes for which statistical offices are responsible. The guidelines for assessing data coverage in each category are described in the methodological appendix to this report. All of the data included in ODIN may be termed “macro” data, in contrast to the “micro” or unit-record data collected through censuses, surveys, and administrative records. ODIN does not evaluate micro data because the approach to assessing their openness would be quite different. However, the principles of open data are just as important for public access to properly curated micro databases. The International Household Survey Network maintains a global catalog of censuses and surveys (http://catalog.ihsn.org/index.php/catalog), which includes information about public access to the data.
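The effect of weighting the three groups equally, rather than weighting the twenty categories equally, can be illustrated with a short sketch. The category scores and function names below are invented for illustration, and the averaging shown is only one plausible reading of "weight the three groups equally"; the exact aggregation is set out in the methodological appendix.

```python
from statistics import mean

# Hypothetical category scores (percent of possible points) for one country.
# Group sizes follow the text: 8 social, 7 economic, 5 environmental.
GROUPS = {
    "social": [40, 35, 50, 30, 45, 25, 20, 55],
    "economic": [70, 60, 65, 55, 50, 75, 45],
    "environmental": [10, 20, 15, 5, 50],
}


def group_weighted_score(groups):
    """Average the three group means, so each group counts equally."""
    return mean([mean(scores) for scores in groups.values()])


def category_weighted_score(groups):
    """Average all categories directly, so larger groups count for more."""
    all_scores = [score for scores in groups.values() for score in scores]
    return mean(all_scores)


if __name__ == "__main__":
    print(round(group_weighted_score(GROUPS), 1))
    print(round(category_weighted_score(GROUPS), 1))
```

With these invented numbers the group-weighted score is about 39.2 and the category-weighted score is 41, because the weak environmental group (five categories) carries a full third of the weight under group weighting.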
Coverage and Openness
National statistical offices are the gatekeepers of official statistics systems with a unique responsibility and opportunity to provide the information needed to make and implement policies and evaluate results. In many countries NSOs struggle to carry out their functions with insufficient resources and in the face of political indifference or, in some cases, interference. Thus, it is not surprising that ODIN reveals significant gaps in the coverage and openness of statistics available from national statistical offices. The ODIN results also show that many poor countries are doing a good job of providing open access to valuable statistics, and many more – rich or poor – could improve their results at very little cost.
Among the 125 countries included in the 2015 ODIN assessments, the median overall score across 20 categories of statistics was 30 out of 100. The highest score was Mexico at 68 out of 100, and the lowest score was Uzbekistan at 3. Scores for the elements of data coverage were, on average, higher than scores for data openness. The median coverage score was 38 out of 100 and the median openness score was 20 out of 100, showing that even where data are available, they often do not satisfy the standards for openness. As the distribution of scores in Figure 1 shows, countries with higher overall scores tend to have proportionately higher openness scores.
The other low-scoring elements of openness were the provision of non-proprietary data formats and systems to allow users to select their own dataset. Non-proprietary formats, such as plain text, CSV, or XML, allow users to process data with open software packages, many of which are available for free. An example of a machine-readable but proprietary format is an Excel file or any other format that requires a user to own specialized software to make use of the data. Non-machine-readable formats include PDF files and graphics formats such as JPG or GIF. Providing machine-readable data downloads in non-proprietary formats could add as much as 20 points to the overall scores of 83 countries.
Figure 3 shows the average scores for the five elements of data coverage. The first element, indicator coverage, shows that on average 60 percent of the sentinel indicators were available. The highest score on this element was 93.7 for Moldova; the lowest score was 6.7 for Armenia. But in many countries, data were available for less than half the years in the past decade and far fewer provided disaggregated data for sub-national administrative units.
Results by Data Categories
Averaging scores across all ten elements gives a view of the availability and openness of data in the 20 categories of statistics included in ODIN. Figure 4 shows that international trade data score highest and that economic data categories (in yellow) occupy six of the seven highest scoring positions. Social statistics (in blue) fall in the mid-range or lower, with the exception of population and vital statistics. Population data are needed to construct sampling frames and to standardize other indicators, so it is encouraging that these data are more widely available. However, many categories of data needed to measure the Sustainable Development Goals have much lower scores. Environmental categories (in green) are among the lowest, with the exception of the built environment, which includes measures of access to water and sanitation that featured prominently in the Millennium Development Goals.
Although Figure 1 shows that openness accounts for the larger part of the variation in ODIN scores, data coverage makes up the largest share in the overall ODIN score. Economic categories have higher scores because the data have better coverage, not because they are more open. In most categories, coverage accounts for 60 to 65 percent of the average score. The environmental categories are the exception, where the coverage component falls to as little as 45 percent of the average score.
Countries included in the 2015 ODIN assessments come from 6 continents, divided into 16 sub-regions. Table 1 shows average scores for data coverage and openness and the overall average score for each region and sub-region. Europe as a whole has the highest average scores, but the average scores in Eastern Asia – made up of China and Mongolia – exceed those of all other sub-regions, and there are sub-regions in Africa, the Americas, and Asia that exceed the averages of Southern Europe. The lowest average scores are found in Middle Africa and in the Pacific Islands of Oceania. In both cases, the largest deficit occurs in the openness scores.
|Average scores||No. of Countries||Overall||Coverage||Openness|
|Oceania (Pacific Islands)||8||16.4||25.2||8.2|
Income Group Results
Do richer countries perform better than poorer ones? In Table 2, countries are grouped by the World Bank’s 2015 income classification. A look at the average scores by income and regional groups gives a mixed picture. The average scores of the small number of high-income countries included in the 2015 assessments are generally lower than those of upper-middle-income countries, and the average scores of the two lower-middle-income countries in Eastern Europe (Moldova and Ukraine) exceed those of the three upper-middle-income countries (Belarus, Bulgaria, and Romania) from the same region.
Table 2: ODIN scores by region and income group
|Average ODIN scores (%)||Low income||Lower-middle income||Upper-middle income||High income|
|Number of countries||26||48||44||6|
Looking at individual scores and income levels reveals even greater heterogeneity. Figure 5 plots each country’s overall ODIN score against GDP per capita, measured in purchasing power parity (PPP) dollars. GDP values from 2011 were used, because data from the International Comparison Project for that year are available for the largest number of countries. Two countries, Argentina and Anguilla, do not have 2011 values for GDP per capita. One country, Saudi Arabia, whose GDP per capita is twice that of the next highest country, has been left out to allow better scaling of the graph.
The median score for the 120 countries shown in Figure 5 is 29.6, and the median income is $5,700. A trendline placed through the data has a slight upward slope, rising by 0.7 points per $1,000 of GDP per capita. But the R-squared statistic, which measures the percentage of variance “explained” by the fitted line, is less than 10 percent. As can be seen, there are a large number of poor countries – 25 out of 60 in the lower half of the income distribution – with scores above the median. Indeed, two of the highest scoring countries are Moldova with a GDP per capita of $4,200 and Rwanda with a GDP per capita of $1,400. Mexico, the highest scoring country in the 2015 ODIN assessment, ranks only 17th in GDP per capita.
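A trendline and R-squared statistic of the kind described above can be computed with an ordinary least squares fit. The sketch below is illustrative only: the data values are invented, the function names are hypothetical, and the report does not state which fitting method was used, so ordinary least squares is an assumption.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y accounted for by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_residual / ss_total


if __name__ == "__main__":
    # Invented values: GDP per capita in thousands of PPP dollars, overall score.
    gdp_thousands = [1.0, 2.5, 4.0, 6.0, 10.0, 18.0]
    scores = [35.0, 22.0, 48.0, 27.0, 40.0, 31.0]
    slope, intercept = fit_line(gdp_thousands, scores)
    print("points per $1,000:", slope)
    print("R-squared:", r_squared(gdp_thousands, scores, slope, intercept))
```

With income measured in thousands of dollars, the slope reads directly as score points per $1,000 of GDP per capita, the unit used in the text.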
ODIN scores for coverage and openness and the aggregate scores over the social, economic, and environmental data categories all exhibit a similar pattern of weak association with average income. While this result certainly raises questions about why some relatively wealthy countries are not able to provide more complete and open data for their citizens and the global public, we take it as an encouraging sign that improvements in data coverage and openness are possible even in small and poor countries.
Focus on gender data
ODIN scores can be weighted to emphasize particular categories of data or specific elements of coverage and openness. The weighting function is not yet incorporated in the ODIN Online website, but will be in the future. Meanwhile, the raw scores – the original scores on each category and element – can be downloaded and used to calculate new aggregates. As a demonstration, consider a new indicator of the coverage and openness of gender relevant statistics.
The ODIN assessments contain 8 categories of data that have particular relevance for monitoring the status and welfare of women. Many of the sentinel indicators examined in the ODIN assessments have been included in the list of proposed Sustainable Development Goal indicators.
The eight categories are:
1. Population & vital statistics
2. Education outcomes
3. Health outcomes
4. Reproductive health
5. Gender statistics
6. Poverty & income statistics
7. Labor statistics
8. Built environment
With the exception of the built environment, the scoring guidelines for the first element of indicator coverage for these categories specify sex disaggregation of the indicators. The built environment is included because it provides data on household access to water and sanitation, issues of particular concern to women and children. The gender statistics category provides data on gender violence and political participation of women. The other nine elements of coverage and openness do not differentiate gender disaggregated data, but, taken together, their scoring reflects systematic differences in the coverage or openness of these data categories. In constructing the gender indicator, all elements and data categories are weighted equally.
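The construction of the gender indicator can be sketched from the downloaded raw scores. In the sketch below the dictionary layout, the function name, and the example values are assumptions made for illustration; only the list of eight categories and the equal weighting of all elements and categories come from the text.

```python
# The eight gender-relevant categories named in the text.
GENDER_CATEGORIES = [
    "Population & vital statistics",
    "Education outcomes",
    "Health outcomes",
    "Reproductive health",
    "Gender statistics",
    "Poverty & income statistics",
    "Labor statistics",
    "Built environment",
]


def gender_index(raw_scores):
    """Return the percent of possible points earned in the gender categories.

    raw_scores maps a category name to its list of ten element scores
    (each 0, 0.5, or 1). Every element and category is weighted equally.
    A missing gender category raises KeyError (hypothetical behavior).
    """
    earned = 0.0
    possible = 0
    for category in GENDER_CATEGORIES:
        elements = raw_scores[category]
        earned += sum(elements)
        possible += len(elements)
    return 100.0 * earned / possible


if __name__ == "__main__":
    # Invented raw scores: the same pattern in every gender category.
    example = {name: [1, 1, 0.5, 0.5, 0, 1, 0, 0.5, 0, 0.5] for name in GENDER_CATEGORIES}
    example["National accounts"] = [1] * 10  # not a gender category, ignored
    print(gender_index(example))
```

Categories outside the eight listed are simply ignored, which is equivalent to giving them a weight of zero.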
The new index, with its greater weight on gender-relevant statistical categories, shifts the ranking of some countries dramatically while leaving others unchanged. (See Figure 6.) El Salvador jumps up 74 places to 17th among 119 countries for which gender rankings were calculated, while Ukraine, ranked 12th overall on the full ODIN score, drops to 42nd. Mexico remains unchanged as the top ranked country. The results tell us that there are many places where greater attention is needed to ensure that gender relevant statistics are available for monitoring the sustainable development goals (SDGs).
Planning for Open Data
The road to open data and open statistical systems is complicated to navigate and must be carefully planned. There are legal, technical, and organizational issues that must be brought into alignment. Strong buy-in from many stakeholders is also needed. In planning the development of their statistical systems, many countries have adopted the framework for National Strategies for the Development of Statistics (NSDS), promulgated by the Partnership in Statistics for Development in the 21st Century (PARIS21). ODIN assessments can serve as a complement to the NSDS process, highlighting strengths and weaknesses in the current system and measuring progress toward a more open system.
An NSDS is a planning tool to guide the development of national statistical systems capable of producing the data necessary to design, implement, and monitor national development policies and programs and to meet their regional and international data commitments. Since the first NSDS guidelines were formalized more than 10 years ago, NSDSs have been prepared by 95 countries, with some now in their second or third generation. (See http://www.paris21.org/Knowledge/381.) PARIS21 provides support and advocacy for NSDSs and, in 2014, launched an updated set of guidelines. Both earlier and recent NSDS 2.0 Guidelines include similar elements, such as assessment of the existing system, stakeholder analysis and engagement, vision and goals, organizational structure, monitoring and evaluation, legal framework, timeline and action plan, a framework for external assistance, and a budget. The new PARIS21 NSDS guidelines incorporate additional elements and guidance on specific topics, including gender data and openness based on 10 years of experience on all continents. The Guidelines are available at http://nsdsguidelines.paris21.org/.
Before countries can implement a plan for open data, they should incorporate the necessary steps in their NSDSs. Among the factors affecting openness cited in the new guidelines are: data confidentiality; establishing legal authority for open data; setting data dissemination goals and targets; and building an open data IT platform. However, a review of NSDSs in fourteen countries conducted by Open Data Watch staff found open data to be among the least frequently mentioned issues identified in the plans. While plans for Mongolia and Timor-Leste made frequent mention of open data principles, nine other countries’ plans made less frequent mention of openness, and Ethiopia, Tanzania, and Zimbabwe made weak or no mention of data openness. The fourteen countries in the NSDS study included seven ranked in the top half of the ODIN openness assessment, among them Mongolia (ranked 4th) and Rwanda (ranked 2nd). Ethiopia, Tanzania, and Zimbabwe, whose NSDSs had weak coverage of open data issues, are all ranked in the bottom thirty percent in the ODIN openness assessment.
The Open Data Inventory is a continuing project of Open Data Watch. Assessments will be carried out annually, beginning in the second quarter of the calendar year. The updated results will be posted in their entirety at the end of the year. All countries included in 2015 will be assessed again in 2016 and fifty or more countries will be added, including many high-income countries. With adequate resources, the intention is to extend ODIN to all recognized official statistical systems.
Although the ODIN methodology has proved to be reliable and reproducible over the course of the 2015 assessment, further changes are possible. We are interested in receiving feedback and suggestions for improvements. NSO websites and their contents frequently change, and we would also appreciate information on new or updated sites or the location of information that could not be found during the 2015 assessment. Feedback of any kind can be sent by email to email@example.com.
I. 2015 ODIN Scores and Rankings
|Russian Federation||East Europe||54.1||7||60.2||7||48.5||9|
|Sri Lanka||Southern Asia||43.6||21||57.4||10||30.9||31|
|South Africa||Southern Africa||38.2||33||49.9||27||27.4||42|
|Saudi Arabia||West Asia||35.1||43||39.6||58||30.9||31|
|Cote d’Ivoire||West Africa||34.8||44||36.3||71||33.4||27|
|Iran, Islamic Rep.||Southern Asia||34.3||48||44.1||45||25.3||48|
|Cabo Verde||West Africa||33.5||51||44.3||44||23.4||53|
|Macedonia, FYR||Southern Europe||33.1||53||22.1||113||43.3||14|
|South Sudan||East Africa||29.7||61||30.8||89||28.8||36|
|Burkina Faso||West Africa||24.9||79||41.7||51||9.3||97|
|Sierra Leone||West Africa||24.5||80||36.4||70||13.6||85|
|St. Vincent & Grenadines||Caribbean||21.8||91||23.5||111||20.2||60|
|El Salvador||Central America||21.0||94||28.6||98||14.0||83|
|Lao PDR||Southeast Asia||20.9||95||34.3||80||8.4||105|
|Bosnia and Herzegovina||Southern Europe||19.8||100||31.9||87||8.6||103|
|Congo, Rep.||Middle Africa||19.4||102||30.1||94||9.6||96|
|Solomon Islands||Pacific Islands||18.1||106||27.0||105||9.9||95|
|Micronesia, Fed. Sts.||Pacific Islands||17.3||108||29.3||96||6.2||116|
|Marshall Islands||Pacific Islands||14.8||114||23.9||110||6.5||114|
|Congo, Dem. Rep.||Middle Africa||14.1||115||22.1||113||6.8||113|
|Papua New Guinea||Pacific Islands||10.6||118||17.3||118||4.4||120|
|Sao Tome and Principe||Middle Africa||9.6||120||15.7||119||4.0||121|
II. Other measures of open data
The Global Open Data Index (http://index.okfn.org/) and the Open Data Barometer (http://www.opendatabarometer.org/) are well known measures of the openness of government produced datasets. The Open Data Barometer (ODB) employs an expert assessment system that relies on scoring by local informants on questions concerning the policies, implementation, and impacts of open government data initiatives. Secondary data are used to complement the expert survey data and assess the readiness of countries to implement open government data initiatives. (See “Methods and “Overview,” http://www.opendatabarometer.org/report/about/method.html.) The Global Open Data Index (GODI), produced by the Open Knowledge Foundation, is a crowd-sourced indicator of the openness of datasets. Information on datasets is gathered through the Open Data Census. The census is “… compiled using contributions from civil society members and open data practitioners around the world, to which the public is invited to contribute at any time; it is then peer-reviewed and checked periodically by a team of 60+ expert country editors.” (See “About the Open Data Index,” https://index.okfn.org/about/.)
Unlike the Open Data Inventory, both indexes include non-statistical information in their assessments, such as national maps, land ownership records, transport timetables, postcodes, government budgets, company registers, and election results. Both indicators include a limited selection of datasets produced by national statistical offices, such as the national accounts, unemployment, and population estimates, but their measures leave out much of the data traditionally associated with official statistics. Both indexes have prioritized high-income countries. This results in limited overlap with the countries assessed by ODIN.
Another index of interest is the World Bank’s Statistical Capacity Indicator (http://datatopics.worldbank.org/statisticalcapacity/). The Statistical Capacity Indicator (SCI) differs from the ODB and GODI in several respects. It considers only the datasets that are traditionally the responsibility of the national statistical office, although modern statistical systems may produce many other kinds of information; the criteria by which datasets are evaluated are derived from published information, rather than the judgment of experts or data users; and it is available for 149 developing countries but not for most countries classified by the World Bank as high income. It does not explicitly consider whether the datasets satisfy criteria for openness. Instead, it is intended to measure the capacity of the country to produce statistics of good quality.
As the cardinal values of the four indexes are not directly comparable, Figures 7a, 7b, and 7c show the rankings of countries by ODIN scores (blue bars) alongside their rankings by GODI, ODB, or SCI scores for the countries they have in common (orange bars), scaled from 0 to 100. As might be expected, there are significant differences in the rankings. The ODB, for example, gives its highest ranking to Chile, which, among the countries the two indexes have in common, is ranked in the 62nd percentile by ODIN. At the other end of the scale, Malaysia, which is ranked in the 6th percentile by ODIN, is in the 76th percentile for the ODB. On the GODI, the highest ranking country is Colombia, which is in the 75th percentile in ODIN, while Jamaica in the 81st percentile of the GODI is ranked in the 7th percentile by ODIN. The simple correlation between ODIN and the ODB is 57 percent; the correlation with the GODI is 44 percent.
The SCI, with its focus on the production of official statistics in developing countries, has more countries in common with ODIN and, at 63 percent, a somewhat higher rank correlation of overall scores. Still there are notable differences. Among the largest are Kosovo, ranked in the 5th percentile by the SCI and in the 76th by ODIN, and El Salvador, ranked in the 24th percentile by ODIN but in the 93rd by the SCI.
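Comparisons of this kind rest on two simple calculations: converting scores to rankings on a 0 to 100 scale among the countries two indexes have in common, and correlating the two sets of ranks. The sketch below shows one way to do both. The percentile convention (100 for the top-ranked country, 0 for the lowest), the use of Spearman's formula, the assumption of no tied scores, and the placeholder country names are all assumptions made for illustration, not a description of how the figures in this report were produced.

```python
def ranks(scores):
    """Map each country to its rank, 1 = highest score. Assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {country: position for position, country in enumerate(ordered, start=1)}


def percentile_ranks(scores):
    """Scale ranks to 0-100, with 100 for the top-ranked country."""
    n = len(scores)
    return {c: 100.0 * (n - r) / (n - 1) for c, r in ranks(scores).items()}


def rank_correlation(scores_a, scores_b):
    """Spearman rank correlation over the countries both indexes cover."""
    common = sorted(set(scores_a) & set(scores_b))
    rank_a = ranks({c: scores_a[c] for c in common})
    rank_b = ranks({c: scores_b[c] for c in common})
    n = len(common)
    d_squared = sum((rank_a[c] - rank_b[c]) ** 2 for c in common)
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Invented scores for placeholder countries A-E.
    index_one = {"A": 68.0, "B": 54.1, "C": 43.6, "D": 21.0, "E": 9.6}
    index_two = {"A": 40.0, "B": 75.0, "C": 55.0, "D": 60.0, "F": 90.0}
    print(percentile_ranks(index_one))
    print(rank_correlation(index_one, index_two))
```

Only countries present in both dictionaries enter the correlation, mirroring the restriction to countries the indexes have in common.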
The differences among the indexes suggest that governments have responded to the demand for open data in different ways. Some have opened and strengthened their statistical systems. Others have been more responsive to the demand for public disclosure of government operations and the release of commercially useful datasets. Some have done both and others have done neither. Further investigation is warranted.
III. ODIN concepts and methodology
The following sections explain the assessment methodology and the assumptions underlying the 2015 ODIN assessments.
The Open Data Inventory assesses the coverage and openness of statistics available from websites maintained by national statistical offices. Websites maintained by private or non-governmental agencies or international agencies are not included in the assessment. Websites maintained by other units of the national government or by sub-national governmental units are included if and only if they can be reached from the national statistical office website.
For example, if the national accounts are maintained by the central bank, then data would be included in the ODIN assessment only if the NSO’s website provides a link to the appropriate page on the central bank’s website or if the NSO reproduces the data on its own website. ODIN is premised on the belief that NSOs can and should take responsibility for providing access to all official statistics.
The Open Data Inventory assesses macrodata. By this we mean data that have been aggregated above the unit record level. We focus on these data because they are the final product released by the NSO or other official agencies. They are used most frequently for policy making and for tracking policy outcomes. Microdata from censuses and surveys are very important, but require a different approach to assessing their openness.
Twenty categories of data are included in the ODIN assessment. Table A3-1 lists the data categories and the sentinel indicators and recommended disaggregations in each category. For the construction of summary measures, the data categories are grouped as social statistics, economic statistics, and environmental statistics.
Data category: Social Statistics
1. Population and vital statistics
Sentinel indicators: Population by 5-year age groups; crude birth rate; crude death rate
Recommended disaggregation: Sex; Marital status
2. Education: Facilities
Sentinel indicators: Number of schools and classrooms; teaching staff; annual budget
Recommended disaggregation: Age group; School stage
3. Education: Outcomes
Sentinel indicators: Enrollment and completion rates; literacy rates and/or competency exam results
Recommended disaggregation: Sex; School stage; Age groups
4. Health: Facilities
Sentinel indicators: Core operational statistics of health system (budget, clinics, hospital capacity, doctors, nurses, midwives)
Recommended disaggregation: Facility type
6. Health: Reproductive health
Sentinel indicators: Maternal mortality ratio; infant mortality rate; under-5 mortality rate; fertility rate; contraceptive prevalence rate; adolescent birth rate
Recommended disaggregation: Mortality rates disaggregated by sex
7. Gender statistics
Sentinel indicators: Specialized studies of the status and condition of women; violence against women; women in parliament and management
Recommended disaggregation: None
8. Poverty Statistics
Sentinel indicators: Number and percentage of poor at national poverty line; distribution of income
Recommended disaggregation: Median income; income shares by deciles
Data category: Economic Statistics
9. National accounts
Sentinel indicators: Production by industry; expenditure by government and households
Recommended disaggregation: Production by industrial classification; Current and constant prices
10. Labor statistics
Sentinel indicators: Employment; unemployment
Recommended disaggregation: Sex; Major age groups; Employment by industry and occupation
11. Price indexes
Sentinel indicators: Consumer price index; Producers price index
Recommended disaggregation: By major components
12. Central government finance
Sentinel indicators: Actual revenues; actual expenditures
Recommended disaggregation: Revenues by source; Expenditures by major categories
13. Money and banking
Sentinel indicators: Money supply
Recommended disaggregation: M1; M2; and so forth
14. International trade
Sentinel indicators: Exports and imports
Recommended disaggregation: Major categories using international trade classification
15. Balance of payments
Sentinel indicators: Exports and imports of goods and services; foreign investment; foreign exchange rates
Recommended disaggregation: Goods and services disaggregated by principal industry groupings
Data category: Environment Statistics
16. Land use
Sentinel indicators: Land area
Recommended disaggregation: Urban; rural; cropping
17. Resource use
Sentinel indicators: Fishery harvests; forests coverage and deforestation; major mining activities including gas/petroleum; water supply & use
Recommended disaggregation: Data in physical units; Location as appropriate
18. Energy use
Sentinel indicators: Consumption of electricity, coal, oil, and renewables
Recommended disaggregation: Industry; households; in physical units
Sentinel indicators: Emissions of air and water pollutants; CO2 and other GHG; toxic substances
Recommended disaggregation: In physical units
20. Built environment
Sentinel indicators: Access to drinking water; access to sanitation; housing quality (from census) Recommended disaggregation: In appropriate units
Elements of Data Coverage and Openness
The data categories are assessed against ten elements of coverage and openness shown in Table A3-2. Each element has a possible score of 1, 0.5, or 0, indicating that the data in a category satisfy the criteria for that element, partially satisfy them, or fail to satisfy them or the data are entirely missing. Thus a country has a maximum potential score of 200: 100 for data coverage and 100 for data openness. The scoring scheme is deliberately coarse. A finer scoring grid (say from 1 to 10) would inevitably invite greater subjectivity on the part of assessors and create problems when comparing results produced by different assessors or at different times. The scoring guidelines for each element are summarized in Table 3.
|Elements of Data Coverage|
|Indicator coverage and disaggregation||Representative indicators and disaggregations available|
|Time coverage||Data available in last 5 years|
|Data available last 10 years|
|Geographic||First admin level|
|Second admin level|
|Elements of Data Openness|
|Download format||Machine readable|
|User selectable/API or bulk download|
ODIN Scoring Guidelines
Element 1: Coverage and Disaggregation
The first element requires assessors to locate representative indicators within each data category and determine whether important topical disaggregations are available. Guidelines for scoring each data category are shown in Table A3-3. The representative indicators and disaggregations are listed in Table A3-1 above. If the score for element 1 is less than 1, none of the remaining four elements of data coverage can exceed the score of element 1. However, the elements of data openness (elements 6 through 10) are scored on the basis of available data, which may receive a full score for openness if they satisfy the guidelines for those elements. If no data are available for a category, all elements are scored 0.
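The capping rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `cap_coverage_scores` and the list layout (ten item scores in element order) are assumptions made for the example, not part of the ODIN tooling.

```python
def cap_coverage_scores(item_scores, data_available=True):
    """Apply the element 1 cap to one category's ten item scores.

    item_scores: scores for elements 1..10, each 0, 0.5, or 1
    (hypothetical layout: positions 0-4 are coverage, 5-9 are openness).
    """
    if len(item_scores) != 10:
        raise ValueError("expected ten item scores, one per element")
    if not data_available:
        # No data for the category: every element scores 0.
        return [0.0] * 10
    element1 = item_scores[0]
    # Coverage elements 2-5 cannot exceed the score of element 1.
    capped_coverage = [element1] + [min(s, element1) for s in item_scores[1:5]]
    # Openness elements 6-10 are scored on the available data and are not capped.
    return capped_coverage + list(item_scores[5:])


print(cap_coverage_scores([0.5, 1, 1, 0.5, 0, 1, 1, 0.5, 1, 0]))
```

In this example the category has only partial indicator coverage, so the two time-coverage scores are reduced to 0.5 while the openness scores are left unchanged.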
Table 3: Scoring Guidelines for Element 1: Indicator Coverage and Disaggregation
|Population and vital statistics||If population data not available by at least 5-year age groups, score no more than 1/2 point; if sex missing, subtract 1/2 point. Birth and death rates are not disaggregated by age|
|Education: Facilities||Breakdown by school stage (primary, lower secondary, secondary, tertiary) score 1/2 point; additional detail including age groups and/or school types (technical training; apprenticeship programs, and so forth) gets an additional 1/2 point|
|Education: Outcomes||Score 1/2 point for enrollment and completion rates by school stage or type; score 1/2 point for exam results; If not disaggregated by sex, subtract 1/2 point|
|Health: Facilities||Score 1/2 point if at least 3 representative indicators present; score 1/2 point more if disaggregated by facility type|
|Health: Preventative care and morbidity||Score 1/2 point for immunization data; score 1/2 point for disease incidence or prevalence. Subtract 1/2 point if not disaggregated by sex|
|Health: Reproductive health||Score 1/2 point for mortality rates; score 1/2 point for fertility, contraceptive prevalence, and adolescent birth rate; subtract 1/2 point if infant and under-5 mortality rates not disaggregated by sex|
|Gender statistics||Score 1/2 point for data on violence against women; score 1/2 point for data on women in management or political office; special studies that include similar information score 1 point. Disaggregation optional|
|Poverty and income statistics||Score 1/2 point for poverty headcount; 1/2 point for income distribution by deciles or finer. Disaggregation optional|
|National accounts||Score 1/2 point for production by industry; score 1/2 point for expenditure data. Subtract 1/2 point if industrial production (value added) not disaggregated by major industry groups: agriculture (including forestry and fishing), industry, and services|
|Labor statistics||Score 1/2 point for employment; score 1/2 point for unemployment; subtract 1/2 point if not disaggregated by sex; subtract 1/2 point if no age group data|
|Price indexes||Score 1/2 point for CPI; score 1/2 point for PPI. Disaggregation optional|
|Central government finance||Score 1/2 point for budget disaggregated by budget categories; score 1/2 point for actual revenues and expenditures by major categories. No points if only totals given|
|Money and banking||Score 1/2 point for monetary aggregates; score 1/2 point for data on the banking system such as total credit to private sector or public sector|
|Trade||Score 1 point if exports and imports of goods disaggregated by major product categories|
|Land use||Score 1/2 point if disaggregated by urban/rural or environmental zones; score 1/2 point if disaggregated by agricultural uses (forest, arable, cropping)|
|Resource use||Score 1/2 point for any two categories; score 1 point for all|
|Energy use||Score 1/2 point for any two categories; score 1 point for three of four. Subtract 1/2 point if electricity not disaggregated by industry and household consumption|
|Pollution||Score 1/2 point for CO2 and other greenhouse gases; score 1/2 point for other emissions and pollutants if source identified|
|Built environment||Score 1/2 point for access to water and sanitation; disaggregation by facility type optional; score 1/2 point for housing quality information with disaggregation by characteristics such as housing type, construction material, or number of rooms|
Elements 2 through 5: Other Elements of Data Coverage
Scoring guidelines for the data coverage elements 2 through 5 are summarized in Table 4. Elements 2 and 3 assess the availability of annual data within each category over the 10-year period 2006 to 2015. Although many countries now provide quarterly data for economic indicators, scoring is based only on annual values. Elements 4 and 5 score the availability of subnational data at the level of first and second administrative units. Assessors are instructed to determine the administrative levels from official sources. Certain categories of economic statistics are not expected to be available for first or second administrative levels; no scores are recorded for those categories.
Table A3-4. Scoring Guidelines for Elements of Data Coverage
|1. Indicator coverage and disaggregation – see Table 3|
|2. Data coverage for the last 5 years
a. 1 point if data are available for at least 3 of the last 5 years
b. 0.5 points if data are available for 1-2 of the last 5 years
c. 0 points if data are unavailable for all of the last 5 years
|3. Data coverage for the last 10 years
a. 1 point if data are available for at least 6 of the last 10 years
b. 0.5 points if data are available for 3-5 of the last 10 years
c. 0 points if data are available for 2 or fewer of the last 10 years
|4. First administrative level
a. 1 point if data available at first subnational level (state, province, and so forth)
b. 0.5 if some data available at first subnational level
c. 0 points if data only available at national level
|5. Second administrative level
a. 1 point if data available at the second subnational level (municipality or other similar division)
b. 0.5 points if some data available at second subnational level
c. 0 points if no data available at this level
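As a rough illustration of the time-coverage rules for elements 2 and 3, the sketch below counts the years with annual data inside the 5-year and 10-year windows ending in 2015. The function name `time_coverage_scores` is hypothetical, and the thresholds are read as "at least 3 of 5" and "at least 6 of 10", which is an interpretation of the table rather than a statement from ODIN.

```python
def time_coverage_scores(years_with_data, last_year=2015):
    """Return (element 2 score, element 3 score) for one data category.

    years_with_data: iterable of years for which annual data were found.
    last_year: end of the assessment window (2015 for the 2006-2015 period).
    """
    years = set(years_with_data)
    # Count available years in the last 5 and the last 10 years.
    n5 = sum(1 for y in range(last_year - 4, last_year + 1) if y in years)
    n10 = sum(1 for y in range(last_year - 9, last_year + 1) if y in years)

    # Element 2: 1 point for 3+ years, 0.5 for 1-2 years, 0 otherwise.
    element2 = 1.0 if n5 >= 3 else 0.5 if n5 >= 1 else 0.0
    # Element 3: 1 point for 6+ years, 0.5 for 3-5 years, 0 otherwise.
    element3 = 1.0 if n10 >= 6 else 0.5 if n10 >= 3 else 0.0
    return element2, element3


print(time_coverage_scores([2008, 2010, 2012, 2014]))
```

Data published every second year, as in the example call, earn half a point on each element: two of the last five years and four of the last ten are covered.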
Elements 6 through 10: Data Openness
Elements 6 through 10 assess the openness of data in a category using criteria derived from the Open Definition. (See http://opendefinition.org/.) Scores for coverage and openness were considered independently. If only one indicator for a certain category was published but that indicator was published in a fully open format, it was given full points for openness. Scores for openness could, therefore, exceed the scores for coverage in the same category, but in practice this rarely happens. The scoring guidelines for the elements of openness are shown in Table 5.
Elements 6 and 7 assess whether data are downloadable in machine-readable, non-proprietary formats. Open data should be available to anyone in convenient and readily modifiable form. Element 8 asks whether users can select the data they are interested in and whether they are able to establish an API connection to the data, which would allow data to be linked to other applications. The alternative is often that data are only available in predetermined tables. The availability of metadata (element 9) is important in providing users with information on how the data were collected and compiled. Clear licensing terms (element 10) state what users may do with the data and permit reuse of data with some restrictions; fully open data may be used and reused without restriction other than providing attribution to the original source.
|6. Machine readable format
a. 1 point if data are downloadable in a machine-readable format (such as XLS, CSV, Stata, SAS, and so forth)
b. 0.5 point if some but not all the data are downloadable in machine-readable format
c. 0 points if data are not available in machine-readable format (such as HTML, JPEG, PDF)
|7. Non-proprietary format
a. 1 point if data downloads are in non-proprietary format (such as CSV)
b. 0.5 point if some but not all data are available in non-proprietary format
c. 0 points if data are not available in non-proprietary format (such as XLS, Stata, SAS, PDF, JPEG)
|8. User selection/ API or bulk download
a. 0.5 points if user can select specific indicators from a dashboard for download; 0 otherwise.
b. Add 0.5 points if an Application Program Interface (API) or other mechanism is available that allows for bulk download.
|9. Metadata available
a. 1 point if metadata are present that provide specific details about the definition of the indicator or the method of data collection and compilation for that indicator
b. 0.5 points if metadata are provided about a large survey or group of data of which the indicator is part. It may require a search of a different section of the website than where the data are to find such metadata.
c. 0 points if no metadata are available.
|10. Licensing terms
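The format rules for elements 6 and 7 and the additive rule for element 8 can be illustrated with a small sketch. The format sets below contain only the examples named in the guidelines, and the function names and the "list of formats per dataset" input are assumptions made for the example; real assessments are made by human assessors, not by code like this.

```python
# Formats named in the guidelines as examples (lowercased for matching).
MACHINE_READABLE = {"xls", "csv", "stata", "sas"}
NON_PROPRIETARY = {"csv"}


def _share_score(datasets, qualifying):
    """1 if every dataset has a qualifying format, 0.5 if some do, 0 if none."""
    if not datasets:
        return 0.0
    hits = sum(1 for formats in datasets
               if qualifying & {f.lower() for f in formats})
    if hits == len(datasets):
        return 1.0
    return 0.5 if hits > 0 else 0.0


def format_scores(datasets):
    """Return (element 6, element 7) scores for one category.

    datasets: list where each entry lists the download formats of one dataset.
    """
    return (_share_score(datasets, MACHINE_READABLE),
            _share_score(datasets, NON_PROPRIETARY))


def element8_score(user_selectable, api_or_bulk):
    """0.5 for user-selectable indicators plus 0.5 for an API or bulk download."""
    return 0.5 * bool(user_selectable) + 0.5 * bool(api_or_bulk)


print(format_scores([["xls"], ["pdf"]]), element8_score(True, False))
```

A category offered partly as XLS and partly as PDF scores 0.5 for machine readability and 0 for non-proprietary format, which shows why the two elements are assessed separately.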
Aggregate ODIN Scores
ODIN scores are summarized along both dimensions of the ODIN assessment: by categories and by elements. In addition, subscores are computed for the combined categories of social statistics, economic statistics, and environmental statistics and for the combined elements of coverage and openness. The overall score aggregates all scores across both dimensions. For convenience, all aggregate scores are standardized by rescaling them to a range of 0 to 100.
Because the three principal topical groupings (social, economic, and environmental) contain different numbers of data categories, aggregates computed over these categories would be implicitly weighted by the number of categories in each grouping.
To neutralize this effect, the data categories are reweighted so that each group has equal weight in aggregates computed over all categories. The reweighting does not affect aggregates computed within each grouping. All elements have equal weights in all aggregates. ODIN Online has an option for downloading both the raw and weighted scores for further analysis. In the future an option for user-specified weights for both categories and elements will be included in the online version of ODIN.
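One way to obtain equal group weights is to give each category a weight inversely proportional to the size of its group. The sketch below assumes the group sizes implied by the category list (8 social, 7 economic, 5 environmental) and normalizes the weights so that they sum to 20 across all categories. The actual ODIN weighting matrix is not reproduced in this report, so treat these numbers as an illustration only.

```python
# Number of data categories in each topical grouping (from the category list).
GROUP_SIZES = {"social": 8, "economic": 7, "environmental": 5}


def category_weights(group_sizes=GROUP_SIZES):
    """Per-category weight for each group so that all groups count equally.

    Weights are scaled so that the total weight over all categories equals
    the number of categories (an assumed normalization).
    """
    total_categories = sum(group_sizes.values())
    # Each group receives an equal share of the total weight.
    group_share = total_categories / len(group_sizes)
    return {group: group_share / n for group, n in group_sizes.items()}


for group, weight in category_weights().items():
    print(f"{group}: {weight:.3f} per category")
```

Under this assumption an environmental category carries the largest weight and a social category the smallest, because the environmental group has the fewest categories.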
The aggregate scores shown in ODIN tables and charts have been standardized. Scores are standardized by dividing by the maximum score achievable and multiplying by 100. For most subscores, the maximum score is the product of the number of data categories and the number of elements included. However, some of the elements of geographic disaggregation have been excluded a priori from the economic categories. Specifically, it is assumed that the national accounts and government finance statistics will not be available at the second administrative level and that money and banking, international trade, and balance of payments statistics will not be available at the first or second administrative levels. Therefore, the maximum unweighted score for the five data coverage elements across all seven economic categories is 27, not 35, and the maximum achievable score over all data categories and elements is 192, not 200. Standardized scores involving any of these categories are reweighted to give them full weight. Because of this discrepancy, subscores over data categories or across elements involving economic statistics will not “add up” consistently, but the treatment of each subscore is internally consistent.
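The maximum scores of 27 and 192 follow directly from the exclusions listed above, as the short check below shows. The dictionary layout and function names are hypothetical, and the `standardize` helper shows only the unweighted form of the calculation (raw score divided by maximum, times 100).

```python
ECONOMIC_CATEGORIES = [
    "National accounts", "Labor statistics", "Price indexes",
    "Central government finance", "Money and banking",
    "International trade", "Balance of payments",
]

# Coverage elements excluded a priori (4 = first admin level, 5 = second).
EXCLUDED = {
    "National accounts": {5},
    "Central government finance": {5},
    "Money and banking": {4, 5},
    "International trade": {4, 5},
    "Balance of payments": {4, 5},
}


def max_coverage_score(categories):
    """Maximum unweighted score over coverage elements 1-5 for the categories."""
    return sum(1 for c in categories for e in range(1, 6)
               if e not in EXCLUDED.get(c, set()))


def overall_max_score(n_categories=20, n_elements=10):
    """Maximum unweighted score over all categories and elements."""
    return n_categories * n_elements - sum(len(v) for v in EXCLUDED.values())


def standardize(raw_score, max_score):
    """Rescale a raw score to the 0-100 range."""
    return 100 * raw_score / max_score


print(max_coverage_score(ECONOMIC_CATEGORIES), overall_max_score())
```

Eight item scores are excluded in total (two second-level items plus three categories with both levels excluded), which accounts for both 35 - 8 = 27 and 200 - 8 = 192.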
IV. Accessing ODIN Online
The ODIN website is located at: http://odin.opendatawatch.com/. The website should be easy to navigate without additional instructions, but here is a short guide to what you will find.
• The Home page displays a map of the world, showing in color the countries that have been included in the 2015 ODIN assessment. Colors indicate the range of their overall ODIN score. Countries in gray were not included in the 2015 ODIN assessments.
• Clicking on a country brings up an information box with the country’s aggregate scores and rank. Clicking on the country name takes you to the Country Profile page. (See below.)
• The Rankings page displays the overall score and aggregate subscores for data coverage and openness for all countries. The display can be sorted by country name, region, or scores by clicking on the table headers.
• The Rankings dataset can be downloaded with the Export button.
• The Country Profile page provides the most detailed information on a country’s ODIN scores. Summary scores are shown for the 20 data categories (aggregated over the elements of coverage and openness) and for the 10 elements of coverage and openness (aggregated over the social, economic, and environmental data categories). Graphs provide regional and global comparisons.
• ODIN countries are grouped by geographic regions and sub-regions defined by the United Nations Statistics Division’s M49 Macro Geographical Regions and Sub-Regions Listing (http://unstats.un.org/unsd/methods/m49/m49regin.htm). Country codes are three-character ISO codes. ODIN also includes the Republic of Kosovo with ISO code XKS, which is not included in the UN list. Three-character regional codes were created for use in ODIN and are not part of the M49 listing.
• ODIN countries have also been classified by the World Bank’s income groups. On the Regional Profile page you can choose to view countries grouped by geographic region or by income group. First select the type of display, then select the regions and sub-regions.
• Data from the Country Profile page can be downloaded with the Export button.
• The Country Comparison page allows users to tabulate aggregate scores for one or more countries. The overall score and five scores aggregated over categories and elements are displayed.
• First select the regions or sub-regions from which to select countries; then select some or all of the countries.
• Data from the Country Comparison page can be downloaded with the Export button. The “spark charts” to the right of the table do not download.
• The Data Download page provides access to the full ODIN dataset at the item level. Three types of scores can be selected: raw, weighted, and standardized. Raw scores are the original scores recorded by the assessors. Weighted scores have been multiplied by a weighting matrix that gives greater weight to the environment and economic data categories in order to compensate for their smaller number of categories in the overall score. Standardized scores are derived from the weighted scores by dividing by the sum of their weights and multiplying by 100. The item-level standardized scores differ from the raw scores by a factor of 100. Weighting only has an effect on the aggregate scores.
• First select regions or sub-regions and then select countries. The entire database can be selected by choosing all regions and countries.
• The aggregate subscores for social, economic, and environmental categories and subscores for coverage and openness elements can be selected for downloading. Aggregates of raw scores and weighted scores are simple sums. Aggregates for standardized scores are weighted averages.
• The Reports page gives access to the ODIN Annual Report, one page country and regional briefs, and other documentation in PDF format.
The Open Data Inventory is a team effort. We are pleased to acknowledge the help of all who contributed to our work.
Open Data Watch
Shaida Badiee, Misha Belkindas, Eric Swanson, Zach Christensen,
Jamison Crowell, Amelia Pittman, Reza Farivari, and Martin Getzendanner
Chandrika Kaul, Amelia Pittman, Jamison Crowell, Maria Vallenilla,
Morgan Smith, Tawheeda Wahabzada, Usman Masood, Mandy Badamkhand, Sophia Rozas, Mariya Fedorchuk, Ela Comanescu, Amira Khalil,
Maissa Khattab, Zach Christensen, and Erik Champenois.
Tim Herzog (World Bank), Martine Durand (OECD), Jon Clifton (Gallup), Geoffrey Greenwell (PARIS21), Jessica Espey (Sustainable Development Solutions Network), Mor Rubinstein (Open Knowledge), Joel Gurin and Laura Manley (Center for Open Data Enterprise)
Website and publication design
District Design Group
“Harvesting Crops” courtesy of the World Bank Photo Collection. Copyright: Flickr/ Curt Carnemark/World Bank
“Reading books by the Chinggis Monument” courtesy of the World Bank Photo Collection. Copyright: Flickr/Khasar Sandag / World Bank
Funding Provided by the William and Flora Hewlett Foundation