Contents
Background and previous studies
Contribution of the current study
Typology of gaps in international and national databases
Microdata sources
Organization of the report
Section 1: Assessment of gaps in gender indicators in international and national databases
Identifying gender indicators
Indicator availability in international and national databases and adherence to standards
Indicator availability by development domain
Causes of gaps
Level of disaggregation
Timeliness and data frequency
Section 2: Microdata inventory
Microdata sources
Overview of least reported indicators in national databases
Indicators that apply specifically to women and girls.
Widely reported indicators
Less available indicators
Sex-disaggregation and household surveys
Administrative data
Other observations on microdata, metadata documentation, and adherence to standards
Section 3. Conclusions, recommendations, and next steps
Main takeaways on indicator availability
Main takeaways on microdata sources
Recommendations
Next steps
Bibliography
Annexes
Annex 2: Indicator assessment methodology
Indicator selection
Domain typology
Country assessments
Notes and observations on the indicator assessment process
Challenges
Background
Challenges and observations during the microdata assessment
Annex 4: Representative Indicator Summary Sheet
Figures
Figure 1 Availability of data in international and national datasets
Figure 2 Gender indicator availability by development domain (for 15 countries)
Figure 3 Timeliness: Most recent year of observation in national databases (%)
Tables
Table 1 Sources of gender data indicators included in the Bridging the Gap Study
Table 2 Number of indicators by development domain
Table 3 Indicators with no data in international and national databases
Table 4: Indicators with data but without sex-disaggregation in international and national databases
Table 5 Data frequency by domain
Table 6 Data frequency by country across all indicators
Table 7 Indicators lacking sex-disaggregated data in national databases
Table 8 Indicators rarely available with sex-disaggregation in national databases
Table 9 Twelve indicators that apply specifically to women and girls in national databases
Table 10 Most frequently sex-disaggregated indicators sourced from microdata in national databases
Table 11 Indicators with limited availability in national databases
Executive Summary
Data2X is committed to improving the quality, availability, and use of gender data in order to make a practical difference in the lives of women and girls worldwide. Open Data Watch seeks to make development data better and more accessible for increased use and impact. Coming together in spring 2018, Data2X and Open Data Watch conceived a study that would offer national statistical offices, international statistical systems, development partners, and others involved in measuring and monitoring the progress of the world’s women and girls a more complete understanding of where gaps in gender data exist, why such gaps occur, and what can be done to fill them. The resulting technical report of this study, Bridging the Gap: Mapping Gender Data Availability in Africa, provides insights into those questions and should help move the development community one step closer to producing high-quality and policy-relevant gender indicators to inform better decisions.
Bridging the Gap assesses the availability of 104 gender-relevant indicators in 15 Sub-Saharan African countries. Gender-relevant indicators measure the status and welfare of women and girls or, when sex-disaggregated, reveal pertinent differences between men and women. This list, if produced regularly and to a high standard, represents the information we need to be able to monitor and deliver on at least current commitments across development domains for women and girls. The availability of the indicators was assessed at the international, national, and microdata levels. Data in international databases have been reported by countries and reviewed by custodian agencies. Data in national databases show whether countries follow methodologies different from those in international sources and highlight national capacities and whether methods employed by countries with sex-disaggregated indicators could be employed in others. Exploration of the microdata (the census, survey, or administrative records used to produce the most recent estimate of the indicator) pinpoints the underlying instruments and assesses whether data are in fact being collected but not being further produced and made accessible. By better understanding the production and availability of gender data at these three levels, we can draw specific lessons on how to fill gender data gaps.
The study examines whether the 104 selected indicators were recorded in any form, if they were sex-disaggregated, and whether they reported against additional advised disaggregation such as geographical location, age, income level, or disability status. Indicators were checked for adherence to international standards and timeliness, specifically how recently the indicator was produced and its frequency. This allows examining gender data gaps in four dimensions: availability, granularity, timeliness, and adherence to standards.
The study revealed that 48 percent of gender-relevant indicators are missing or lack sex-disaggregated data in the study countries at both international and national levels. In international databases, 22 percent of the indicators lack any sex-disaggregation and 26 percent are missing data entirely. In national databases there are more missing observations (35 percent) but a smaller proportion – 13 percent – that lack sex-disaggregation. This persistence of large gaps in both international and national databases points to the need for a coordinated effort to improve data collection and adopt common standards for the compilation of indicators.
The study looks at availability of gender data across six development domains: health, education, economic opportunity, political participation, human security, and the environment. The health domain has the highest proportion of sex-disaggregated data, with 73 percent of the indicators sex-disaggregated at the international level. Environment has the lowest proportion of sex-disaggregated data, virtually none. None of the six domains assessed have more than 73 percent availability of sex-disaggregated indicators, showing that even where data availability is highest, significant gender data gaps exist.
Gender data availability varies between international and national databases as well as between countries themselves. There are some data for 74 percent of gender indicators in international databases, versus 65 percent in national databases. In national databases, Kenya and Lesotho (54) produced the fewest gender indicators, while Ghana (83) produced the highest number. Frequency of indicator production is highest in South Africa, where there was an average of 4.9 observations per indicator, and lowest in Ethiopia, with only 1.1. Variations in data availability and in the capacity to fill data gaps show that countries make difficult choices in their data production as a result of resource limitations.
The study underlines that administrative sources are a potential wealth of high-quality sex-disaggregated information giving insight into the lives of women and girls that cannot be achieved with surveys. To play this role, however, they require improved documentation and greater accessibility. At present, internationally sponsored surveys like the Demographic and Health Survey (DHS) and the Multiple Indicators Cluster Survey (MICS) are the most frequently cited sources. This points to an overreliance on these data sources which, while high-quality, carry with them the limitations of any survey exercise.
In addition to the results of the assessments and the findings described in this report, the study produced an expansive dataset that will be used to inform further research and analysis about gender data availability and accessibility.
Introduction
Data gaps are voids in our knowledge of the world and the people and communities who live in it. They restrict our ability to understand the steps we need to take to achieve progress and measure the impact of our work. In the case of gender data, these gaps limit our knowledge of the status and well-being of women and girls in countries around the world. Just as gender data are essential for designing and monitoring programs to improve the well-being of women and girls, knowledge of the location and persistence of gender data gaps is needed to design programs and mobilize resources for filling those gaps.
Data2X and Open Data Watch conducted this study to provide a quantitative assessment of availability of statistical indicators that are of particular relevance to measuring the living conditions of women and girls. The study reports on the availability of 104 gender-relevant indicators, their disaggregations, and the frequency of observations in international and national databases and publications in 15 selected Sub-Saharan African countries. The study also documents the microdata sources (censuses, surveys, and administrative records) used to construct 68 gender-relevant indicators included in the SDGs.
The study results show that on average, sex-disaggregated data are available for only 52 percent of the gender-relevant indicators in the 15 countries studied. These gaps are not uniformly distributed, although gaps were found in almost every indicator and large gaps exist in every country’s gender statistics. Using the results of this study, we can identify where the gaps are greatest and point to innovative approaches to filling them.
Background and previous studies
In 2014, Data2X published the first comprehensive report on the availability of gender indicators, Mapping Gender Data Gaps. This study noted that “globally, close to 80 percent of countries regularly produce sex-disaggregated statistics on mortality, labor force participation, and education and training. Less than a third of countries disaggregate statistics by gender on informal employment, entrepreneurship, violence against women, and unpaid work (Buvinic et al. 2014).” Data2X included in their study some of the 52 indicators that comprised the Minimum Set of Gender Indicators proposed by the United Nations Inter-agency and Expert Group on Gender Statistics (UNSC 2013).
Following the publication of the Mapping Gender Data Gaps study, Data2X and Open Data Watch proposed 20 gender indicators that were ready to measure, meaning that the indicators were generally available and/or the necessary microdata sources existed. The study drew on the World Bank’s Gender Data Navigator (GDN) to identify surveys with sufficient data for constructing the indicators (World Bank n.d.). The GDN is an application website built on the International Household Survey Network’s (IHSN) microdata archive that identifies household surveys and censuses that contain gender-relevant topics or indicators.[1]
Contribution of the current study
This study extends the work of Mapping Gender Data Gaps in several directions. First, it employs an expanded list of 104 indicators composed of 68 SDG indicators, 27 indicators from the Minimum Set not included in the SDGs, and 9 supplemental indicators proposed by UN Women (UN Women 2017). Second, it adds a new domain, environment, to the five used in the original Mapping Gender Data Gaps work. Finally, it focuses on a limited set of countries, 15 in Sub-Saharan Africa, to enable a deep dive assessment of data at international, national, and microdata levels.
Typology of gaps in international and national databases
1. Availability
The study recorded the availability of data in international databases such as the United Nations Global SDG Database and the World Bank’s Gender Data Portal and in national databases and publications. For each indicator and country, the study assessors noted whether indicators were available with sex-disaggregation and the other disaggregations required by the SDGs, the number of observations available between 2010 and 2018, and the location of metadata describing the sources and methods used to construct the indicator.
2. Level of disaggregation
The occurrence of each indicator for each of the 15 study countries was assessed for whether it was fully disaggregated or if it lacked one or more required disaggregations.[2] Indicators that lacked sex-disaggregation were recorded separately.
3. Timeliness and frequency
Indicators were assessed for their timeliness and frequency. Timeliness was measured from the date of the most recent observation and frequency by the number of observations available over the period of 2010 to 2018.
4. Adherence to standards
Adherence to international standards is documented by the inventory of metadata recorded as part of the assessments.
Microdata sources
The study links the indicators to their microdata sources and provides a summary data page offering a description of the indicator, documentation of the indicator produced by each country, and its microdata sources. The information recorded during the indicator assessments was used to identify the censuses, surveys, or administrative records used to construct the indicators found in national databases. Survey questionnaires were examined as needed to clarify sources and the availability of disaggregations.
Organization of the report
The report is organized into four main sections. The first section describes indicator availability, level of disaggregation, and frequency in international and national databases. The second section traces the microdata sources for the 68 gender-relevant indicators found in the SDGs and illustrates the availability of data with and without sex-disaggregation for each indicator. Section 3 summarizes the results of the study, offers recommendations, and proposes next steps to fill gender data gaps. The final section consists of four annexes that list the study indicators, describe the methodology of the indicator assessments, describe the methodology of the microdata assessments, and give an example of an indicator summary sheet.
Section 1: Assessment of gaps in gender indicators in international and national databases
Identifying gender indicators
This research dataset contains 104 gender-relevant indicators that were identified by the United Nations Statistics Division (UNSD) or UN Women, or that are included in the Sustainable Development Goals (SDGs).[3] If data were available for each of these indicators, this would provide a portion of the information we need to be able to monitor and deliver on current commitments for women and girls.
Where did the 104 indicators come from?
In 2013 the Inter-agency and Expert Group on Gender Statistics (IAEG-GS) proposed a Minimum Set of Gender Indicators endorsed by the 44th session of the UN Statistical Commission (UNSC). The minimum set includes 52 indicators drawn from datasets maintained by UN agencies (UNSC 2013). Subsequently, the Inter-agency and Expert Group on the Sustainable Development Goals (IAEG-SDG) proposed a set of 232 indicators to measure the 17 SDGs. UN Women identified 54 SDG indicators that are “specifically or largely targeted” at women or girls (UN Women 2017). However, a number of these are among the SDG indicators that were classified by the IAEG-SDG as Tier 3 indicators because they lack an agreed-upon methodology and are not available for most countries (IAEG-SDG 2018). Because of this, Tier 3 indicators were not included in this research dataset, though much more work is needed to develop their methodologies. The Tier 1 (well-defined and produced by more than half of all countries) and Tier 2 (well-defined but less frequently produced) SDG indicators proposed by UN Women are included in the research dataset. The remaining indicators from the Minimum Set that have agreed-upon methodologies and have been produced regularly in at least some countries are also included.
UN Women noted that “a less restrictive criteria where all indicators that are relevant for women and girls and can be disaggregated by sex are included would yield a greater listing of gender-relevant indicators.” (UN Women 2018). Accordingly, Open Data Watch conducted a more detailed assessment and identified 36 additional Tier 1 and Tier 2 SDG indicators that are commonly published with sex-disaggregation or might be at a future date. More recently, UN Women proposed supplemental indicators to ensure that there exists at least one indicator for each of the 17 SDGs (UN Women 2018). Open Data Watch selected nine of these indicators to include in the research dataset.
Table 1 Sources of gender data indicators included in the Bridging the Gap Study
| Origin of indicators and availability | Number of indicators | Share |
| --- | --- | --- |
| UN Women SDG Gender Indicators | 32 | 29.8 |
| Additional SDG Gender Indicators | 36 | 34.6 |
| Minimum Set | 27 | 26.9 |
| UN Women Supplemental Indicators | 9 | 8.7 |
| Total | 104 | 100.0 |
Indicator availability in international and national databases and adherence to standards
The international databases studied are those maintained by designated custodian agencies such as the WHO, UNICEF, ILO, or compilations maintained by the World Bank and the UNSD. National data covered by the study included databases in online data retrieval systems, online publications of national statistical offices or other government agencies, and nationally published research findings.
For each indicator and each country, study assessors noted whether data for the selected indicators were available in one or more years between 2010 and 2018 and whether the indicators were available with sex-disaggregated data and included other disaggregations specified in their original description. The results of this assessment were recorded separately for international and national databases. Among the 104 indicators covered, the study included 12 indicators that apply only to women, such as maternal mortality, all of which are counted in the same category with sex-disaggregated indicators.
Indicators that did not conform to the published standard in the SDG metadata (United Nations 2018) or did not match the technical description of the listed indicator but provided relevant information were recorded separately as “related” indicators. These indicators measure an event or condition similar to the designated indicators and, in many cases, may better reflect national systems and users’ needs. In some cases, the differences are only in the choice of the normalizing value such as publishing the proportion of unemployed persons in the population instead of the labor force.
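The recording rules described above can be sketched as a small classifier. This is a minimal illustration only: the record layout, the field names, the category labels, and the order in which the checks are applied are assumptions made for the example, not the study's actual coding scheme.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One indicator in one country and one database type (hypothetical layout)."""
    has_data: bool             # at least one observation between 2010 and 2018
    conforms: bool             # matches the published indicator definition
    sex_disaggregated: bool    # also True for indicators that apply only to women
    all_disaggregations: bool  # every disaggregation named in the definition is present


def classify(a: Assessment) -> str:
    """Assign one availability category; the check order is an assumption."""
    if not a.has_data:
        return "no data"
    if not a.sex_disaggregated:
        return "no sex-disaggregation"
    if not a.conforms:
        return "related, sex-disaggregated"
    if a.all_disaggregations:
        return "complete disaggregation"
    return "sex-disaggregated, lacking other disaggregations"


# Illustrative record: a non-standard but sex-disaggregated national indicator.
print(classify(Assessment(True, False, True, False)))
```

Under these assumptions, a non-conforming indicator that also lacks sex-disaggregation is counted as lacking sex-disaggregation rather than as a related indicator.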
Figure 1 Availability of data in international and national datasets
Figure 1 shows the proportion of indicators available in international and national databases. In international databases, 44 percent of the indicators are available with complete disaggregation, while in national databases, this proportion is just 32 percent. Both national and international databases have a small share of indicators that lack some specified disaggregation but include sex-disaggregation. National databases contain a larger share of related, non-standard indicators with sex-disaggregation. In international databases, 22 percent of the possible observations lack sex-disaggregation and an additional 26 percent lack any data at all. In national databases, 12 percent of the indicators lack sex-disaggregation and 35 percent are missing entirely. In total, sex-disaggregated data for gender-relevant indicators are unavailable in any year for 48 percent of the possible observations in international or national databases.
Indicator availability by development domain
Each of the 104 indicators can be classified according to six domains of development: health, education, economic opportunity, political participation, human security, and environment.[4] The count of indicators in each domain and their share of the total, alongside the share of indicators with sex-disaggregation averaged over the 15 countries, are shown in Table 2.
Table 2 Number of indicators by development domain
| Domain | Count of indicators: Total | Count of indicators: Share (%) | Indicators with sex-disaggregation: National databases (%) | Indicators with sex-disaggregation: International databases (%) |
| --- | --- | --- | --- | --- |
| Economy | 23 | 22.1 | 22.1 | 19.4 |
| Education | 22 | 21.2 | 24.0 | 22.1 |
| Environment | 8 | 7.7 | 0.9 | 0.0 |
| Health | 32 | 30.8 | 34.7 | 42.9 |
| Human security | 12 | 11.5 | 11.0 | 8.0 |
| Public participation | 7 | 6.7 | 7.3 | 7.6 |
| Total | 104 | 100 | 100 | 100 |
Health is the largest domain with 32 indicators, followed by economy and education. The incidence of sex-disaggregated indicators found in each domain differs between national and international databases, with the greatest difference occurring in health and the smallest in public participation. The domain with the fewest indicators with sex-disaggregated data was environment. National databases are more likely to have sex-disaggregated data in the environment, economy, education, and human security domains. Environmental indicators are the least available, with no sex-disaggregated data found in international databases. The study revealed that indicators that depend on household-level data often cannot be disaggregated by sex.
The number of indicators with and without sex-disaggregation is shown in Figure 2. To simplify this presentation, all sex-disaggregated indicators are grouped together, and indicators that lack sex-disaggregation are grouped with those lacking any data. The count of indicators is taken across all 15 countries in the study.
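The grouping used for this presentation can be sketched as follows. This is an illustration under stated assumptions: the category labels, the `(domain, category)` record shape, and the sample records are hypothetical, not taken from the research dataset.

```python
from collections import Counter

# Hypothetical labels for categories that count as sex-disaggregated.
SEX_DISAGGREGATED = {
    "complete disaggregation",
    "sex-disaggregated, lacking other disaggregations",
    "related, sex-disaggregated",
}


def group_by_domain(records):
    """Count indicator-country records per domain, with and without sex-disaggregation.

    records: iterable of (domain, category) pairs, one per indicator and country.
    Records lacking sex-disaggregation are grouped with those lacking any data.
    """
    counts = Counter()
    for domain, category in records:
        group = "with" if category in SEX_DISAGGREGATED else "without"
        counts[(domain, group)] += 1
    return counts


# Illustrative records only.
sample = [
    ("Health", "complete disaggregation"),
    ("Health", "no data"),
    ("Environment", "no sex-disaggregation"),
]
print(group_by_domain(sample))
```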
Figure 2 Gender indicator availability by development domain (for 15 countries)
Causes of gaps
The study identified gaps in sex-disaggregated data in both international and national databases, including two types of absolute gaps: indicators with no recorded data in any of the 15 study countries and indicators with data but no sex-disaggregated data. Absolute gaps are more common in international databases but there are more indicators with complete coverage – indicators with at least one observation in all countries – in international databases (20) than in national databases (5). This suggests that notwithstanding the broad scope of the SDGs, international reporting is highly specialized and focused on a limited set of indicators.
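The two extremes described here, indicators with no data in any study country and indicators with complete coverage, can be identified with a few set operations. The sketch below assumes a hypothetical mapping from each indicator to the set of countries that have at least one observation; the indicator codes and country names in the example are placeholders.

```python
def coverage_summary(observations, countries):
    """Split indicators into absolute gaps and complete coverage.

    observations: dict mapping an indicator code to the set of countries
                  with at least one observation (hypothetical layout).
    countries:    the study countries.
    Returns (no_data, complete) as sorted lists of indicator codes.
    """
    study = set(countries)
    # Absolute gap: no study country has any observation.
    no_data = sorted(i for i, seen in observations.items() if not (seen & study))
    # Complete coverage: every study country has at least one observation.
    complete = sorted(i for i, seen in observations.items() if study <= seen)
    return no_data, complete


# Placeholder data for illustration.
obs = {"A": {"X", "Y"}, "B": set(), "C": {"X"}}
print(coverage_summary(obs, ["X", "Y"]))
```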
No data
Table 3 lists nine indicators that have no published data for any of the 15 countries in international databases and two more that are available in international databases but lack data in national databases. These are spread across every domain in the study except health. The two education indicators (4.1.X6 and 4.1.X10) are supplemental indicators that already have similar measures in the SDG indicator framework. The remaining seven are SDG indicators that are classified by the IAEG-SDG as Tier 2, meaning they have agreed-upon methodologies but are not yet generally available. Of those seven, the missing SDG indicators 5.2.2 and 16.1.4 from the human security domain (proportion of women subjected to sexual violence and proportion of the population that feel safe walking alone) require data from victimization surveys or well-functioning administrative systems, as does SDG indicator 16.5.1 (proportion of persons who had at least one contact with a public official and who paid a bribe to a public official). The environment SDG indicator 11.2.1 (access to transportation) is difficult to measure: the SDG methodology suggests using GIS data accompanied by a specialized survey to determine how close individuals live to public transportation. Data for economic SDG indicator 8.5.1 (average hourly earnings of female and male employees, by occupation, age and persons with disabilities) are available for 64 countries in the SDG database, but most are from high-income or upper-middle-income countries and none are from Africa.
Table 3 Indicators with no data in international and national databases
| Indicator number | Indicators unavailable in international databases | Domain |
| --- | --- | --- |
| 4.1.X10 | Proportion of women with less than a high school diploma | EDUC |
| 4.1.X6 | Proportion of women with six or less years of education | EDUC |
| 4.3.1 | Participation rate of youth and adults in formal and non-formal education and training in the previous 12 months, by sex | EDUC |
| 5.2.2 | Proportion of women (aged 15-49) subjected to sexual violence by persons other than an intimate partner, since age 15* | HUMN |
| 8.5.1 | Average hourly earnings of female and male employees, by occupation, age and persons with disabilities | ECON |
| 11.2.1 | Proportion of population that has convenient access to public transport, by sex, age and persons with disabilities | ENVT |
| 16.1.4 | Proportion of population that feel safe walking alone around the area they live | HUMN |
| 16.3.1 | Proportion of victims of violence in the previous 12 months who reported their victimization to competent authorities or other officially recognized conflict resolution mechanisms | HUMN |
| 16.5.1 | Proportion of persons who had at least one contact with a public official and who paid a bribe to a public official, or were asked for a bribe by those public officials, during the previous 12 months | PART |
| | Indicators unavailable in national databases | |
| 1.1.1 | Proportion of population below the international poverty line, by sex, age, employment status and geographical location (urban/rural) | ECON |
| 3.9.1 | Mortality rate attributed to household and ambient air pollution | ENVT |
Only two indicators are entirely without data in national databases. SDG Indicator 1.1.1 (poverty rate at the international poverty line) is available for 13 countries in international databases but is not reported by countries. Calculation of SDG indicator 3.9.1 (pollution mortality rate) requires use of international tables of risk factors combined with locally gathered exposure data. Like the international poverty rate, there are observations for all 15 countries in international databases, but none are sex-disaggregated.
Level of disaggregation
Granularity refers to the smallest units of observation or disaggregation available for each indicator. The SDGs include indicators at different levels of granularity while the descriptions of some indicators have no suggested disaggregations. Still others specify multiple disaggregations, including sex, age, employment status, geographical location, disability status, pregnant women, the poor and vulnerable, work-injury victims, education levels, indigenous status, enterprise size, and others.[5]
The combined research dataset includes indicators whose description specifies disaggregation by sex (or that apply only to women) and others for which sex-disaggregation would be relevant but is not specified. The research team noted whether the indicator definition specifies other disaggregations. Indicators that included all the required disaggregations were classified accordingly and distinguished from those that included sex-disaggregation but lacked one or more of the other specified disaggregations.
Table 4 lists 16 indicators that are available in international databases for one or more countries but lack sex-disaggregation in at least one country. There are three additional indicators in national databases that have at least one observation but lack sex-disaggregation while two of these also lack sex-disaggregation in international databases.
Table 4: Indicators with data but without sex-disaggregation in international and national databases
| Indicator number | Indicators available without sex-disaggregation in international databases | Domain |
| --- | --- | --- |
| 1.2.2 | Proportion of men, women and children of all ages living in poverty in all its dimensions according to national definitions | ECON |
| 1.5.1 | Number of deaths, missing persons and directly affected persons attributed to disasters per 100,000 population | ENVT |
| 2.1.1 | Prevalence of undernourishment | HEAL |
| 3.3.Y | Access to anti-retroviral drug, by sex | HEAL |
| 3.3.3 | Malaria incidence per 1,000 population | HEAL |
| 3.6.1 | Death rate due to road traffic injuries | HEAL |
| 3.3.4 | Hepatitis B incidence per 100,000 population | HEAL |
| 3.3.5 | Number of people requiring interventions against neglected tropical diseases | HEAL |
| 3.9.2 | Mortality rate attributed to unsafe water, unsafe sanitation and lack of hygiene | ENVT |
| 4.6.1 | Proportion of population in a given age group achieving at least a fixed level of proficiency in functional (a) literacy and (b) numeracy skills, by sex | EDUC |
| 5.b.1 | Proportion of individuals who own a mobile telephone, by sex | ECON |
| 6.1.1 | Proportion of population using safely managed drinking water services | ENVT |
| 6.2.1 | Proportion of population using safely managed sanitation services, including a hand-washing facility with soap and water | ENVT |
| 10.1.1 | Growth rates of household expenditure or income per capita among the bottom 40 per cent of the population and the total population | ECON |
| 16.1.3 | Proportion of population subjected to physical, psychological or sexual violence in the previous 12 months | HUMN |
| 16.3.2 | Unsentenced detainees as a proportion of overall prison population | HUMN |
| | Indicators available without sex-disaggregation in national databases | |
| 16.5.1 | Proportion of persons who had at least one contact with a public official and who paid a bribe to a public official, or were asked for a bribe by those public officials, during the previous 12 months | PART |
| | Indicators available without sex-disaggregation in international and national databases | |
| 3.9.1 | Mortality rate attributed to household and ambient air pollution | ENVT |
| 7.1.X | Proportion of women with access to clean cooking fuel | ENVT |
| 11.1.1 | Proportion of urban population living in slums, informal settlements or inadequate housing | ENVT |
Only three indicators specify disaggregation by sex: 1.2.2 (population living in poverty in all its dimensions), 5.b.1 (proportion of individuals owning a mobile phone), and 3.3.Y (access to anti-retroviral drug). The remaining indicators were included by UN Women or Open Data Watch as measures that might be disaggregated by sex. Although not available in international databases, sex-disaggregated data are available in some national databases. In addition, the SDG metadata for many environment indicators suggest disaggregation by sex, but deriving these indicators requires additional household-level data to complement incidence estimates drawn from administrative sources. More action is needed to develop the underlying data. Household data obtained from surveys could, for example, easily determine how many women have access to clean cooking fuels (7.1.X) and how many children are living in households that lack clean cooking fuels. The closely related SDG indicator of mortality rates attributed to ambient air pollution (3.9.1) should be obtained from well-functioning vital statistics systems that record cause of death adequately for men and women.
The six health indicators that lack sex-disaggregated data in international databases should be available from administrative records of the health system, except for SDG indicator 2.1.1 (prevalence of undernourishment), which is based on aggregate data for which sex-disaggregation is not possible. The research database records at least one country that has published sex-disaggregated data for each of these indicators. This suggests that international reporting is not taking full advantage of all of the national-level data that are available. But many gaps remain at the national level, indicating a continuing need to upgrade the monitoring and reporting mechanisms for these and other indicators.
Timeliness and data frequency
Timeliness refers to the gap between the reference year of an observation and the time when it becomes available to data users. Inevitably, time lags exist between the date when data are recorded and the date of publication. However, with online data it is not apparent when an observation first appeared, which means that the study only records the reference year and not the date of publication of each observation between 2010 and 2018.
Looking at the most recent year of observation, the largest number of observations in the international databases occurred in 2016, with only Rwanda recording its modal value in 2015. This tracks closely with results for national databases, shown in figure 3. Botswana has the oldest data, with a median year of 2011; and South Africa has the most recent with a median year of 2017. In national databases, the least timely indicator was life expectancy at age 60, which had a median value of 2012 for countries reporting data. In international databases, the least timely indicator was the poverty rate at the international poverty lines with a median value of 2012.
Figure 3 Timeliness: Most recent year of observation in national databases (%)
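As a rough illustration of the timeliness measure described above, the following Python sketch takes the most recent reference year recorded for each indicator and summarizes those years per country by median and mode. The records, country labels, and function names are hypothetical and are not drawn from the study dataset.

```python
from collections import defaultdict
from statistics import median, multimode

# Hypothetical records: (country, indicator, reference year). Not study data.
observations = [
    ("Country A", "3.1.1", 2011),
    ("Country A", "3.1.1", 2016),
    ("Country A", "8.5.2", 2014),
    ("Country A", "4.6.1", 2016),
    ("Country B", "3.1.1", 2011),
    ("Country B", "8.5.2", 2010),
    ("Country B", "8.5.2", 2012),
]

def most_recent_years(records, start=2010, end=2018):
    """Return {country: sorted latest reference years, one per indicator}."""
    latest = defaultdict(dict)
    for country, indicator, year in records:
        if start <= year <= end:
            previous = latest[country].get(indicator, year)
            latest[country][indicator] = max(previous, year)
    return {country: sorted(years.values()) for country, years in latest.items()}

def timeliness_summary(records):
    """Summarize each country's latest observation years by median and mode."""
    return {
        country: {"median": median(years), "modes": multimode(years)}
        for country, years in most_recent_years(records).items()
    }

summary = timeliness_summary(observations)
```

Because the publication date of an observation is not recorded, a summary of this kind describes only how recent the reference years are, not how quickly the data were released.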
The frequency with which data are available is measured by the number of times the indicator is observed between 2010 and 2018. Table 5 shows the average number of observations per indicator for those indicators that had at least one non-zero value.
The average number of non-zero observations across all domains tells us that data, when they are available, occur with higher frequency in international databases. Put another way, indicators for which there are any data are likely to have more data over the time period in international databases. The domain with the highest frequency is health in international databases. The average of 3.9 observations per indicator shows that observations are available for fewer than half the years between 2010 and 2018. The high frequency of health indicators in international databases may reflect interpolated or modeled estimates by custodian agencies. Education statistics, which are commonly based on administrative data reported by ministries of education, are the most frequently reported category of data in national databases.
Table 5 Data frequency by domain
| Domain | International databases: total observations | International databases: average number of observations per indicator | National databases: total observations | National databases: average number of observations per indicator |
|---|---|---|---|---|
| Economy | 888 | 3.5 | 425 | 1.8 |
| Education | 513 | 2.8 | 580 | 2.8 |
| Environment | 370 | 3.7 | 136 | 2.2 |
| Health | 1796 | 3.9 | 563 | 1.7 |
| Human security | 150 | 1.7 | 197 | 1.9 |
| Public participation | 240 | 3.8 | 168 | 2.6 |
| Total | 3957 | 3.5 | 2069 | 2.1 |
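The frequency measure behind Table 5 can be sketched in Python as follows. This is a minimal illustration under two assumptions that the report does not spell out: an indicator is counted at most once per country and reference year, and the average is taken over country-indicator pairs that have at least one observation. The records and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical records: (country, indicator, reference year). Not study data.
observations = [
    ("Country A", "3.1.1", 2010),
    ("Country A", "3.1.1", 2013),
    ("Country A", "3.1.1", 2016),
    ("Country A", "8.5.2", 2015),
    ("Country B", "3.1.1", 2014),
    ("Country B", "3.1.1", 2014),  # duplicate year, counted once (assumption)
]

def average_frequency(records, start=2010, end=2018):
    """Return (total observations, average per country-indicator pair with data)."""
    # A set keeps one observation per country, indicator, and year.
    unique = {(c, i, y) for c, i, y in records if start <= y <= end}
    per_pair = Counter((c, i) for c, i, _ in unique)
    if not per_pair:
        return 0, 0.0
    total = sum(per_pair.values())
    return total, total / len(per_pair)

total, average = average_frequency(observations)
```

Pairs with no observations are excluded from the denominator, which is why the average describes how often data recur once they exist rather than how widely they are available.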
The availability of data differs significantly between countries. Table 6 shows the number and average frequency of non-zero observations for the 15 study countries. This assessment shows that no country has published data for all 104 indicators, and the difference between the highest and lowest is equivalent to more than 20 indicators. International databases provide the greatest number of non-zero observations with the highest average frequency. Tanzania has the greatest number of non-zero observations in international databases, but South Africa has the highest frequency. In national databases, Ghana has the largest number of non-zero observations, and South Africa, again, has the highest frequency.
Table 6 Data frequency by country across all indicators
| Country | International databases: number of observations | International databases: average frequency of observations | National databases: number of observations | National databases: average frequency of observations |
|---|---|---|---|---|
| Botswana | 61 | 4.0 | 63 | 1.8 |
| Côte d’Ivoire | 81 | 3.1 | 63 | 1.3 |
| Ethiopia | 78 | 3.6 | 65 | 1.1 |
| Ghana | 82 | 3.3 | 83 | 1.3 |
| Kenya | 70 | 3.2 | 54 | 2.6 |
| Lesotho | 70 | 3.9 | 54 | 1.9 |
| Malawi | 75 | 3.6 | 78 | 1.8 |
| Nigeria | 76 | 3.5 | 80 | 2.0 |
| Rwanda | 81 | 3.4 | 78 | 3.2 |
| Senegal | 82 | 3.9 | 56 | 2.5 |
| South Africa | 75 | 4.1 | 61 | 4.9 |
| Tanzania | 85 | 3.3 | 67 | 1.8 |
| Uganda | 83 | 3.3 | 59 | 1.8 |
| Zambia | 68 | 3.7 | 79 | 2.2 |
| Zimbabwe | 80 | 3.3 | 71 | 1.2 |
| Average all countries | 76 | 3.5 | 67 | 2.1 |
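The country comparisons drawn from Table 6 can be checked with a short Python sketch. The values below are copied from Table 6; the column constants and the helper function are illustrative names, not part of the study's tooling.

```python
# Table 6 values: country -> (international count, international frequency,
#                             national count, national frequency)
table6 = {
    "Botswana": (61, 4.0, 63, 1.8), "Côte d’Ivoire": (81, 3.1, 63, 1.3),
    "Ethiopia": (78, 3.6, 65, 1.1), "Ghana": (82, 3.3, 83, 1.3),
    "Kenya": (70, 3.2, 54, 2.6), "Lesotho": (70, 3.9, 54, 1.9),
    "Malawi": (75, 3.6, 78, 1.8), "Nigeria": (76, 3.5, 80, 2.0),
    "Rwanda": (81, 3.4, 78, 3.2), "Senegal": (82, 3.9, 56, 2.5),
    "South Africa": (75, 4.1, 61, 4.9), "Tanzania": (85, 3.3, 67, 1.8),
    "Uganda": (83, 3.3, 59, 1.8), "Zambia": (68, 3.7, 79, 2.2),
    "Zimbabwe": (80, 3.3, 71, 1.2),
}

# Illustrative column indexes into each row tuple.
INTL_COUNT, INTL_FREQ, NAT_COUNT, NAT_FREQ = range(4)

def extremes(table, column):
    """Return ((countries, value) at the minimum, (countries, value) at the maximum)."""
    values = {country: row[column] for country, row in table.items()}
    low, high = min(values.values()), max(values.values())
    lowest = sorted(c for c, v in values.items() if v == low)
    highest = sorted(c for c, v in values.items() if v == high)
    return (lowest, low), (highest, high)

nat_count_low, nat_count_high = extremes(table6, NAT_COUNT)
nat_freq_low, nat_freq_high = extremes(table6, NAT_FREQ)
intl_count_low, intl_count_high = extremes(table6, INTL_COUNT)
intl_freq_low, intl_freq_high = extremes(table6, INTL_FREQ)

# Spread between the best and worst covered countries in each source.
nat_spread = nat_count_high[1] - nat_count_low[1]
intl_spread = intl_count_high[1] - intl_count_low[1]
```

Returning lists of countries rather than a single name keeps ties visible, such as two countries sharing the lowest count in national databases.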
Section 2: Microdata inventory
Section 1 of this report covered the availability of 104 gender-relevant indicators in national and international databases. Section 2 presents the findings of the study's examination of the microdata sources of the 68 gender-relevant SDG indicators located in national databases and describes a metadata inventory that complements the work done in Section 1. Wherever possible, the research team identified the census, survey or administrative records used to produce the most recent estimate of the indicator as well as information on earlier or alternative sources. The location of metadata of censuses and surveys cataloged by the International Household Survey Network (IHSN) or in national data archives was also recorded. More details of the methodology used in Section 2 of the study are described in Annex 3.
The goal of Section 2 is to create a record of the availability of the microdata used to produce each indicator, to identify systematic reasons for gaps in the statistical record, and to determine the means of filling those gaps. Because of the lag time between data collection and the appearance of an indicator based on survey data and the additional lag in posting information about the survey, the study found little information about surveys conducted after 2015. In some cases, the existence of more recent surveys was discovered by examining the IHSN catalog where there are also delays in posting survey metadata.
Microdata sources
The microdata sources most frequently used to construct gender indicators were Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS). Across all 15 countries, 36 indicators were sourced from DHS, although no single country attributed more than 28 indicators to DHS. Among the 19 most frequently available sex-disaggregated indicators, at least 12 were primarily sourced to DHS or MICS. These survey programs are sponsored by USAID and UNICEF and have played a major role in establishing a core set of data for women and children, especially in the health and education domains. Most DHS and MICS datasets are documented in the International Household Survey Network (IHSN) archive and are available as public use files, which made this assessment possible and further underscores the importance of making data open and available to the public.
This study also revealed that other frequently cited sources were labor force surveys, censuses, and national living condition or welfare monitoring surveys. The last group, which are often based on the World Bank’s Living Standards Measurement Study (LSMS), provide data on household income or consumption needed to construct poverty indicators and other household-based measures such as access to water and sanitation. Many surveys and censuses are documented in the IHSN archive, where it is possible to consult the original questionnaires. However, although metadata documentation is accessible, access to the micro datasets themselves often requires an application and permission from the owner.
Administrative records were most frequently cited as sources of education statistics and for specialized indicators such as crime and victimization, traffic deaths, and mortality rates (such as suicide, poisoning, or unsafe water) where reporting of cause of death is needed. Where administrative records were not available, survey results were sometimes used, but many countries that lack well-functioning civil registration and vital statistics systems do not report on these indicators. Access to documentation of administrative data remains the largest gap in the microdata sources for gender statistics. The study found no instances of public access to administrative records or their metadata and, in many cases, documentation of indicators referred only to the ministry or agency that produced the indicators with no information about the specific dataset deployed.
Botswana is the only country in the study that has not conducted a DHS or MICS survey, although it did receive funding from UNICEF for its Family Health Survey. As a consequence, Botswana makes greater use of data from its 2011 Population and Housing Census and from administrative records of the education and health systems. However, because censuses are conducted only once in a decade, they serve as important reference points but cannot provide timely data. Not surprisingly, Botswana has the earliest median year of data availability among the 15 study countries.
While administrative records have the advantage of providing continuous reporting, in some cases the derived indicators do not conform to the definition of the SDG indicator. For example, Botswana reports the location of a birth (home or hospital) and birth outcome based on vital statistics records. In the study assessment these data are recorded as related to SDG indicator 3.1.2 (births attended by skilled health personnel), but they do not provide a direct measure of the qualifications of the health personnel involved.
Overview of least reported indicators in national databases
Thirty of the 104 indicators studied were not sex-disaggregated for 11 or more countries in national databases. The 12 female-specific indicators, such as maternal mortality, were not among these 30 and were found to have better representation and were reported more often. The largest number of indicators in this group came from health (8) and the second largest came from the environment. Four indicators with no sex-disaggregated data available in national databases are listed in table 7.
Table 7 Indicators lacking sex-disaggregated data in national databases
| Indicator number | Domain | Indicator | Available with sex-disaggregation | Available not sex-disaggregated |
|---|---|---|---|---|
| 1.1.1 | ECON | Proportion of population below the international poverty line, by sex, age, employment status and geographical location (urban/rural) | 0 | 0 |
| 3.9.1 | ENVT | Mortality rate attributed to household and ambient air pollution | 0 | 0 |
| 11.1.1 | ENVT | Proportion of urban population living in slums, informal settlements or inadequate housing | 0 | 8 |
| 16.5.1 | PART | Proportion of persons who had at least one contact with a public official and who paid a bribe to a public official, or were asked for a bribe by those public officials, during the previous 12 months | 0 | 3 |
Indicator 1.1.1 (poverty rate at international poverty line) is not reported by any country, though data for 13 countries are available from international databases. Indicators 1.1.1, 3.9.1 (pollution mortality rate) and 11.1.1 (population living in slums) all depend on household-level data whose intra-household distribution cannot be determined. The ILO publishes sex-disaggregated data on poverty by employment status, but these are based on the number of men and women in poor households rather than measurements of individual poverty status. Indicator 16.5.1 (persons reporting a bribe) could be disaggregated by sex if the appropriate data were collected, but data are available for only three countries and only one (South Africa) uses a crime victimization survey.
Table 8 shows three more indicators that are available in a majority of countries in national databases but lack sex-disaggregation in most. All of these indicators record conditions shared by all members of a household. Where the indicators are disaggregated by sex, the disaggregation is based on the sex of the head of household and therefore does not necessarily capture the women living in the household. Because of cultural conventions that often attribute the status of head of household to the senior male resident, these measures are likely to underestimate the proportion of women affected. The limitations of household data are discussed further below.
Table 8 Indicators rarely available with sex-disaggregation in national databases
| Indicator number | Domain | Indicators | Available with sex-disaggregation | Available without sex-disaggregation |
|---|---|---|---|---|
| 6.2.1 | ENVT | Proportion of population using safely managed sanitation services, including a hand-washing facility with soap and water | 1 | 14 |
| 6.1.1 | ENVT | Proportion of population using safely managed drinking water services | 2 | 13 |
| 1.2.1 | ECON | Proportion of population living below the national poverty line, by sex and age | 3 | 11 |
Indicators that apply specifically to women and girls.
Twelve SDG indicators are defined as applying specifically to women or girls. Eight of the 12 indicators are among the 19 most widely reported indicators, and all eight are sourced to DHS or MICS surveys. Data on women in parliament (5.5.1) come from administrative sources. Data on women in local governments are generally not available and the SDG metadata provide no information or guidance for collecting these data. Data for indicator 5.a.1 (women’s ownership of agricultural land) are sourced to a limited number of surveys, most of which do not provide adequate information to produce the defined indicator. Only Nigeria’s 2015 Panel Survey and Zimbabwe’s 2014 Agricultural and Livestock Survey asked specifically about the sex of each landowner in the household. The other six countries with data were assessed as providing a related indicator, although in some cases the data recorded ownership of non-agricultural land. Indicator 5.5.2 (women in management) generally comes from labor force surveys (LFS), although some countries reported data from DHS.
Table 9 Twelve indicators that apply specifically to women and girls in national databases
| Indicator number | Domain | Indicator | Available with sex-disaggregation |
|---|---|---|---|
| 3.1.1 | HEAL | Maternal mortality ratio | 15 |
| 3.1.2 | HEAL | Proportion of births attended by skilled health personnel | 15 |
| 3.7.1 | HEAL | Proportion of women of reproductive age (aged 15-49 years) who have their need for family planning satisfied with modern methods | 14 |
| 3.7.2 | HEAL | Adolescent birth rate (aged 10-14 years; aged 15-19 years) per 1,000 women in that age group* | 15 |
| 5.2.1 | HUMN | Proportion of ever-partnered women and girls aged 15 years and older subjected to physical, sexual or psychological violence by a current or former intimate partner in the previous 12 months, by form of violence and by age | 13 |
| 5.2.2 | HUMN | Proportion of women (aged 15-49) subjected to sexual violence by persons other than an intimate partner, since age 15* | 10 |
| 5.3.1 | HUMN | Proportion of women aged 20-24 years who were married or in a union before age 15 and before age 18 | 15 |
| 5.3.2 | HUMN | Proportion of girls and women aged 15-49 years who have undergone female genital mutilation/cutting, by age | 8 |
| 5.5.1 | PART | Proportion of seats held by women in (a) national parliaments and (b) local governments† | 11 |
| 5.5.2 | PART | Proportion of women in managerial positions | 14 |
| 5.6.1 | HEAL | Proportion of women aged 15-49 years who make their own informed decisions regarding sexual relations, contraceptive use and reproductive health care | 10 |
| 5.a.1 | ECON | (a) Proportion of total agricultural population with ownership or secure rights over agricultural land, by sex; and (b) share of women among owners or rights-bearers of agricultural land, by type of tenure | 1 |
Widely reported indicators
Eighteen of the 68 SDG indicators in the study dataset are available with sex-disaggregation in 12 or more countries. As in the previous discussion, these include indicators that are related to but do not match the specific definition of the corresponding SDG indicator. Seven are included as indicators specific to women and girls in table 9 and the remaining 11 are shown in Table 10.
Table 10 Most frequently sex-disaggregated indicators from microdata in national databases
| Indicator number | Domain | Indicators | Available with sex-disaggregation | Available without sex-disaggregation |
|---|---|---|---|---|
| 8.5.2 | ECON | Unemployment rate, by sex, age and persons with disabilities | 14 | 1 |
| 8.7.1 | ECON | Proportion and number of children aged 5-17 years engaged in child labor, by sex and age | 14 | 1 |
| 9.2.2 | ECON | Manufacturing employment as a proportion of total employment | 14 | 0 |
| 4.6.1 | EDUC | Proportion of population in a given age group achieving at least a fixed level of proficiency in functional (a) literacy and (b) numeracy skills, by sex | 12 | 1 |
| 2.2.2 | HEAL | Prevalence of malnutrition (weight for height >+2 or <-2 standard deviation from the median of the WHO Child Growth Standards) among children under 5 years of age, by type | 14 | 0 |
| 3.2.1 | HEAL | Under-five mortality rate | 14 | 1 |
| 3.2.2 | HEAL | Neonatal mortality rate | 13 | 1 |
| 3.a.1 | HEAL | Age-standardized prevalence of current tobacco use among persons aged 15 years and older | 13 | 0 |
| 2.2.1 | HEAL | Prevalence of stunting (height for age <-2 standard deviation from the median of the WHO Child Growth Standards) among children under 5 years of age | 13 | 0 |
| 16.1.3 | HUMN | Proportion of population subjected to physical, psychological or sexual violence in the previous 12 months | 12 | 0 |
| 16.9.1 | PART | Proportion of children under 5 years of age whose births have been registered with a civil authority, by age | 13 | 2 |
The three most commonly available indicators are from the economic domain and, with few exceptions, are calculated from labor force surveys. Nigeria, which has no labor force surveys recorded in the IHSN catalog, calculates the disaggregated unemployment rate (8.5.2) from its General Household Survey but does not publish disaggregated data. Other countries make use of the national census or MICS. The education indicator on functional literacy and numeracy (4.6.1) depends on survey data, but only Ethiopia, for which the microdata source could not be determined, publishes an indicator matching the exact description of the SDG indicator. The five health indicators all come from DHS surveys or similar national health surveys, as does information on registered births (16.9.1). The proportion of people subjected to violence (16.1.3) uses data from DHS or victimization surveys; in six of the 12 countries with data, only data for women are available.
Less available indicators
There are 14 indicators shown in Table 11 for which sex-disaggregated data are available in 5 to 11 countries in national databases. Those that can be obtained from DHS or labor force surveys are the most commonly available, including the proportion of young people experiencing sexual violence (16.2.3) and the three most available economic indicators. Indicator 5.4.1 (time spent on unpaid domestic and care work) is noteworthy because only Rwanda and Zimbabwe report complete data from time-use studies. The other five countries with related data report on time spent on household chores or the number of unpaid family workers. Nevertheless, the international SDG database reports fully disaggregated data for Ethiopia, Tanzania, and South Africa, all calculated by the UN Statistics Division from time-use modules attached to labor force surveys.
Indicators sourced primarily from administrative records, including many education and health indicators, are problematic because published sources contain little information about the provenance of the underlying data.
Table 11 Indicators with limited availability in national databases
| Indicator number | Domain | Indicators | Available with sex-disaggregation | Available without sex-disaggregation |
|---|---|---|---|---|
| 16.2.3 | HUMN | Proportion of young women and men aged 18-29 years who experienced sexual violence by age 18 | 11 | 0 |
| 8.3.1 | ECON | Proportion of informal employment in non-agriculture employment, by sex | 10 | 0 |
| 8.6.1 | ECON | Proportion of youth (aged 15-24 years) not in education, employment or training | 10 | 1 |
| 8.5.1 | ECON | Average hourly earnings of female and male employees, by occupation, age and persons with disabilities | 9 | 1 |
| 4.c.1 | EDUC | Proportion of teachers in: (a) pre-primary; (b) primary; (c) lower secondary; and (d) upper secondary education who have received at least the minimum organized teacher training | 9 | 1 |
| 17.8.1 | ECON | Proportion of individuals using the Internet | 7 | 8 |
| 5.4.1 | ECON | Proportion of time spent on unpaid domestic and care work, by sex, age and location | 7 | 0 |
| 4.2.2 | EDUC | Participation rate in organized learning (one year before the official primary entry age), by sex | 7 | 0 |
| 5.b.1 | ECON | Proportion of individuals who own a mobile telephone, by sex | 6 | 9 |
| 3.3.1 | HEAL | Number of new HIV infections per 1,000 uninfected population, by sex, age and key populations | 6 | 3 |
| 3.5.2 | HEAL | Harmful use of alcohol defined according to the national context as alcohol per capita consumption (aged 15 years and older) within a calendar year in liters of pure alcohol | 6 | 0 |
| 3.3.3 | HEAL | Malaria incidence per 1,000 population | 6 | 5 |
| 16.1.1 | HUMN | Number of victims of intentional homicide per 100,000 population, by sex and age | 6 | 6 |
| 3.4.1 | HEAL | Mortality rate attributed to cardiovascular disease, cancer, diabetes or chronic respiratory disease | 5 | 4 |
Sex-disaggregation and household surveys
Out of the 30 indicators with the lowest availability of sex-disaggregated data, nine depend exclusively on collective characteristics of the household that cannot be assigned to one resident or another. Two others, indicators 1.2.2 (multidimensional poverty) and 1.3.1 (coverage of social protection programs), involve data that could be sex-disaggregated but mix those data with collective data on characteristics of the household or participation in safety net programs. For these indicators, the lack of an agreed-upon methodology for identifying differences in incidence for women and other vulnerable groups undermines the promise of “leave no one behind.”
A straightforward way to account for differences in the frequency of men and women (or other groups) in collective household data is to count their numbers and compare their distribution to the general population. Thus, one might ask whether women or men make up the majority of residents in households whose average income or consumption puts them below the national or international poverty line. Because the proportion of men and women in the general population is not identical, a correction should be made for prior probabilities. Alternatively, the same data can be used to calculate the proportion of all women who live in poor households and compare that with men. Further disaggregations can be made on the basis of age, ethnicity or disability status if the corresponding data are available for the general population.
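The two calculations described above can be sketched in Python. This is a minimal illustration, assuming a household roster that records each member's sex and a household-level poverty flag; the roster, field names, and function names are hypothetical. The "representation ratio" is one possible reading of the correction for prior probabilities: each group's share of residents in poor households divided by its share of the general population.

```python
# Hypothetical household roster. Not study data.
households = [
    {"poor": True, "members": ["F", "F", "M"]},
    {"poor": True, "members": ["F", "M", "M", "F", "F"]},
    {"poor": False, "members": ["M", "F"]},
    {"poor": False, "members": ["M", "M", "F", "M"]},
]

def count_by_sex(roster, poor_only=False):
    """Count residents by sex, optionally restricted to poor households."""
    counts = {"F": 0, "M": 0}
    for household in roster:
        if poor_only and not household["poor"]:
            continue
        for sex in household["members"]:
            counts[sex] += 1
    return counts

def poverty_composition(roster):
    """Compare each sex's presence in poor households with the general population."""
    everyone = count_by_sex(roster)
    poor = count_by_sex(roster, poor_only=True)
    total_all, total_poor = sum(everyone.values()), sum(poor.values())
    if total_poor == 0 or min(everyone.values()) == 0:
        raise ValueError("roster needs poor households and residents of both sexes")
    result = {}
    for sex in ("F", "M"):
        share_of_poor = poor[sex] / total_poor
        share_of_population = everyone[sex] / total_all
        result[sex] = {
            "share_of_poor_residents": share_of_poor,
            "share_of_population": share_of_population,
            # Above 1 means over-represented in poor households (assumed correction).
            "representation_ratio": share_of_poor / share_of_population,
            # Alternative measure: share of all women (or men) living in poor households.
            "rate_in_poor_households": poor[sex] / everyone[sex],
        }
    return result

composition = poverty_composition(households)
```

Both measures rely only on counts of residents, so they describe who lives in poor households rather than who is individually poor.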
While these calculations provide an aggregate measure of the disproportionate experience of poverty, they have limited value in targeting programs to vulnerable people, particularly those who live in households not identified as poor. There is clearly a need for more finely targeted questions that record individual responses, probe more deeply on individuals’ levels of consumption or control over household assets, and document conditions within households that affect the health and safety of women, men, and children. This is already done by DHS and MICS and many national living condition surveys. The World Bank’s 2018 Poverty and Shared Prosperity Report (World Bank, 2018) gives examples of how inequalities in consumption can be modeled using modest amounts of additional data.
Administrative data
Unlike censuses and surveys, for which standards for documentation such as the Data Documentation Initiative (DDI) are widely used, the existence of most administrative records can only be inferred from the indicators produced from them. Although it is common to provide the public with access to files from censuses and surveys, similar access points are not generally available for microdata from education and health information systems or civil registration systems. Statistical yearbooks, for example, routinely provide counts of the total number of students enrolled in school, sometimes by grade and usually by sex, but systems that provide school-by-school details of facilities, staffing, and student progress are rare. No such data were found in the 15 study countries. This is a significant gap that deprives the public of valuable information and inhibits the development of innovative applications that could benefit from these important data.
The study shows that indicators derived from administrative databases tend to be non-standard because of differences in institutional structures, reporting mechanisms, definitions and the completeness of records used to construct the indicators. As a result, indicators based on administrative sources may lack comparability between countries and over time. The UNESCO Institute for Statistics adjusts the education data it obtains from countries to account for differences in the grade structures of school systems, which is one of many sources of national differences in education data. Similarly, WHO maintains standards for the classification of diseases and a verbal autopsy protocol for reporting the cause of death, but these are not consistently applied. And in most of the countries in this study, health indicators are based only on reported cases at health centers, which may require correction for selection or reporting biases. For example, early estimates of the HIV/AIDS epidemic in Africa were based on data collected at sentinel sites and proved to be significantly biased once biomarkers collected through household surveys became available to establish the greater impact of HIV/AIDS on women.
Administrative records can provide more complete data than surveys and do so at higher frequencies. However, in order to become useful sources of reliable statistics, these records need to be well-documented and standardized to facilitate comparisons between countries and over time. Furthermore, standards for anonymization of data are needed to permit the open dissemination of administrative records for public use.
Other observations on microdata, metadata documentation, and adherence to standards
As a starting point, the study compared the published metadata for the SDG indicators to the metadata available for equivalent national indicators. Although the study did not attempt to match indicators to specific survey records, questionnaires available in the IHSN were consulted to clarify sources and the availability of sex-disaggregated data. In some instances, sex-disaggregated microdata exist even when the published indicator has only a combined value. The review of national indicators uncovered many discrepancies between the SDG metadata specifications and those of national indicators. For example, some surveys ask about attitudes and perceptions rather than experience. While four SDG indicators (5.2.1 – women subjected to violence by intimate partner, 5.2.2 – women subjected to sexual violence by non-intimate partner, 16.1.3 – population subjected to violence, and 16.2.3 – young women and men who experienced sexual violence by age 18) ask about the experience of forms of violence, the MICS 2016 survey in Nigeria, which has questions on attitudes towards domestic violence, does not question respondents on whether they have been subjected to such violence.
This reflects a difference between the SDG target and indicator and existing practice in the country as well as the limitations of the SDG metadata files. Many of these metadata files are more concerned with the processing of indicators by the custodian agencies for international databases than with providing practical guidance on the collection of microdata and construction of the national level indicators. More complete documentation, including methods of data collection and indicator computation stressing the importance of disaggregation, would encourage standardization of the reported SDG indicators and increase their availability.
Section 3. Conclusions, recommendations, and next steps
Main takeaways on indicator availability
Conclusions in brief:
- Large gender data gaps exist in national and international databases and action must be taken to fill these gaps.
- None of the six topical domains assessed in this study had more than 73 percent availability of sex-disaggregated data, showing that even where data availability is highest, significant gender data gaps exist.
- The study identified indicators for which the most recent observations were old enough to only be of historical interest, underscoring the importance of regular updates and timely release of data to support planning, monitoring, and decision making.
- Variations in data availability and capacity to fill data gaps show that countries make difficult choices in their data production decisions as a result of resource limitations.
- Systems producing microdata play a significant role in providing raw data for gender indicators. Countries are dependent on internationally sponsored surveys to generate microdata for gender-relevant SDG indicators.
- Administrative data were underreported and undocumented, and were found to be weaker than the data produced by surveys.
- Improvements to household surveys to better distinguish between residents would make these surveys more useful in reporting on gender-relevant SDG indicators.
The study has documented the persistence of large gender data gaps in both national and international databases and has established a baseline measurement of the availability of indicators in the 15 study countries. Taken together, these countries contain 60 percent of the population of Sub-Saharan Africa and 8.5 percent of the world’s population. Only 52 percent of the 104 indicators identified as important for measuring the well-being of women have one or more sex-disaggregated observations in the 15 study countries. For the SDG indicators included in the study dataset the availability rate is 45 percent in national databases and 44 percent in international databases.
While the availability rates of national and international databases are nearly identical, there are large differences across the six topical domains. Availability rates for sex-disaggregated data in international databases are highest in the health domain, at 73 percent, but only 58 percent in national databases. Availability rates for education are highest in national databases at 60 percent, and second highest, at 54 percent, in international databases. Availability rates are lowest in both sources in the environment domain, where 90 percent of 8 indicators in 15 countries lack sex-disaggregated data. Most of the indicators that lack sex-disaggregated data are classified as Tier 2, for which agreed-upon methodologies are available but global coverage is acknowledged to be low. Nevertheless, the low coverage rates in 15 Sub-Saharan countries point to the need for improved efforts to build capacity, collect data, and publish the resulting indicators with sex and other recommended disaggregations.
Indicators reported in international databases, such as the UN SDG Global Database or the World Bank’s World Development Indicators usually conform to international standards. National databases are more likely to report indicators that are related to but differ from the descriptions and metadata published by international agencies. While not conforming to international standards, these indicators may satisfy domestic needs.
The study showed that international databases are more selective in their reporting than national databases. While the average number of observations available per indicator is higher in international databases, 9 indicators have no data in the international databases for any of the 15 countries, and 19 more lack any sex-disaggregated data. Indicator frequency is lower in national databases, but only two indicators lack data in all 15 countries, one being the poverty headcount at the international poverty line produced by the World Bank. Four more indicators lack sex-disaggregation in national databases. Three of these are environment indicators that are also missing from international databases.
More than a third of indicators with data in national databases have available observations within the last three years, with the largest share occurring in 2016 (indicators are not the same in every country). Data published within two years of the reference period are considered reasonably timely, but for the remaining indicators, the most recent observations are much older. Seventeen percent of the observations were last published in 2010 or 2011, making them only of historical interest at this point. Regular updating and timely release are needed to support planning, monitoring, and decision making.
Patterns of data availability, predictably, vary across the countries assessed in this study. In national databases, the number of indicators with one or more observations is lowest in Kenya and Lesotho (54) and highest in Ghana (83). However, data frequency is highest in South Africa, where there was an average of 4.9 observations per indicator, and lowest in Ethiopia, with only 1.1. These results may reflect a strategic choice by some countries to produce higher frequency data for a more limited number of indicators, as South Africa has only 61 indicators with one or more data points. Others may choose wider but less frequent coverage, but it also seems likely that low frequencies reflect the inability of countries and donors to sustain statistical programs.
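The two measures contrasted above, breadth of coverage (indicators with one or more observations) and data frequency (average observations per indicator), can be tallied as in the sketch below. It is illustrative only: the observation years are made up, the function name is hypothetical, and it assumes the average is taken over indicators that have at least one observation, which the report does not specify.

```python
from statistics import mean


def coverage_and_frequency(observations):
    """Return (indicators with >= 1 observation, mean observations per such indicator).

    `observations` maps an indicator number to the list of years with data.
    Assumption: indicators with no data are excluded from the average.
    """
    counts = [len(years) for years in observations.values() if years]
    if not counts:
        return 0, 0.0
    return len(counts), mean(counts)


# Hypothetical data for one country (not taken from the study).
example = {
    "3.1.2": [2010, 2013, 2016],
    "8.5.2": [2012, 2014, 2015, 2016, 2017],
    "5.4.1": [2014],
    "16.2.2": [],
}
n_indicators, avg_obs = coverage_and_frequency(example)
print(n_indicators, avg_obs)
```

A country can score high on one measure and low on the other, which is the pattern described for South Africa and Ghana.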
Main takeaways on microdata sources
A review of the microdata sources used to produce the 68 gender-relevant SDG indicators found that 14 of the 15 countries are dependent on internationally sponsored surveys such as MICS and DHS for a large share of their data. These and similar surveys are the principal survey sources for health and education indicators. Only Botswana, which has the highest average income of the study countries, has not conducted a MICS or DHS survey. For economic indicators, labor force surveys are the principal source of data. Each country in the study conducted at least one living standards survey to collect information on household income or consumption; these surveys, along with the population census, also collect data on housing conditions and other assets. The remaining microdata sources were specialized surveys or administrative records.
Administrative data can and should play an important role in the production of official statistics, but they are underreported and undocumented. In many cases the use of administrative data could only be inferred from general statements about the ministry or agency responsible for compiling the indicators without an identifiable database or supporting metadata. The study results also point to the weaknesses of administrative data. In countries without well-functioning civil registration and vital statistics systems, censuses and surveys remain the only way of adjusting incomplete data to produce representative statistics. Because administrative systems differ from country to country, further adjustments are needed to produce cross-country comparable data. The relatively large number of related but non-conforming indicators found in national databases may reflect the use of administrative sources.
Household surveys have their own limitations for reporting gender data. Unless data are collected separately from men and women (and children and other dependents) in the same household, only a single household data point is available. And even when separate schedules are used, collective consumption and the use of household assets cannot be attributed to individual residents. This has consequences for a number of SDG indicators, including all the indicators for Goal 1 and other indicators in Goals 2, 3, 6, 10, and 11.
One important goal of this study was to identify examples of countries able to produce sex-disaggregated indicators that might serve as a model for countries with missing indicators or indicators that lack sex-disaggregation. Summary pages for each of the 68 gender-relevant SDG indicators will include a definitional extract of the international metadata and the principal or most recent source of microdata used to produce the indicator in each country. These pages can be used by national statistical offices, international data production sponsors, and others tasked with meeting the data demands of the SDG indicators to improve data collection methodology and produce more useful, more consistent sex-disaggregated data. The pages will also examine whether data are being collected without being used to produce SDG indicators, so that each gap can be identified as a data collection gap or a data production gap. The summary pages, along with the study methodology, will be made available at a later date as a separate output of the project.
Recommendations
The results of the study represent the best effort to date to locate data for defined gender-relevant indicators in international and national databases. The resulting map of existing gender data gaps should inform the national statistical offices and custodian agencies responsible for producing the indicators. Similarly, the methodology and metadata should serve as a learning tool for reviewing the current availability of sex-disaggregated data in the countries studied and in other countries around the world. This study should serve as the basis for a discussion of forward-looking plans to fill data gaps through new surveys and censuses, improvements to administrative records, or exploration of new data sources.
Where existing microdata are sufficient to produce sex-disaggregated indicators as identified by this study, a systematic effort to compile and disseminate disaggregated data should be undertaken. This study is intended to complement the World Bank’s proposed extension of its Gender Data Navigator to identify specific survey items needed to construct SDG gender indicators.
Many indicators included in the study were available only once or twice in the nine-year period under review, and more than 60 percent of the most recent observations were 4 years old or older. Using the information collected in the study, countries and their development partners should establish a data collection program that provides data at adequate frequency and timeliness.
Administrative data offer little or no access to microdata or metadata. Investments are needed to make granular data from administrative systems more available for public access. This will require the development of access systems, appropriate controls to safeguard the privacy of individual records, and documentation and dissemination of both the microdata and the indicators derived from them.
Next steps
- Further work is needed to assess the microdata sources used to produce important gender indicators. Going forward, countries, their development partners, and SDG custodian agencies need to establish data collection programs that meet the need for timely, high-quality gender data.
- The country assessments will be shared with national statistical offices or other national specialists to support further documentation of gender statistics and to support strategies for filling gaps.
- As the open data movement expands its scope, administrative databases should receive more attention and resources to match their ability, when constructed appropriately, to produce highly relevant sex-disaggregated data.
- Data2X and Open Data Watch will publish summary pages for each of the 68 gender-relevant SDG indicators at a later date. These pages will offer insights into what should be done to improve data collection and production for each of the gender-relevant SDG indicators.
- Data2X and Open Data Watch will promote the findings of this study with the international gender-data community and solicit feedback on the research, conclusions, and recommendations to inform further exploration of the data generated by this study.
Bibliography
Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. 2014. Mapping Gender Data Gaps. Data2X. https://www.data2x.org/wp-content/uploads/2017/11/Data2X_MappingGenderDataGaps_FullReport.pdf
Data2X and Open Data Watch. 2016. Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. https://opendatawatch.com/wp-content/uploads/2016/03/ready-to-measure.pdf
Inter-agency and Expert Group on Gender Statistics (IAEG-GS). 2017. The United Nations Minimum Set of Gender Indicators. United Nations Statistics Division, 6 June 2017. https://genderstats.un.org/files/Minimum%20Set%20indicators%20web.pdf
Inter-agency and Expert Group on the Sustainable Development Goals (IAEG-SDG). 2018. Tier Classification for Global SDG Indicators. 15 October 2018. https://unstats.un.org/sdgs/files/Tier%20Classification%20of%20SDG%20Indicators_15%20October%202018_web.pdf
UN Women. 2017. Gender-related Sustainable Development Goal Indicators. https://www.data2x.org/wp-content/uploads/2017/05/UNWomenList_GenderSDGIndicators.pdf
UN Women. 2018. Turning Promises Into Action: Gender Equality in the 2030 Agenda for Sustainable Development. http://www.unwomen.org/-/media/headquarters/attachments/sections/library/publications/2018/sdg-report-gender-equality-in-the-2030-agenda-for-sustainable-development-2018-en.pdf?la=en&vs=4332
United Nations Statistical Commission (UNSC). 2013. Gender Statistics: Report of the Secretary General. http://www.un.org/ga/search/view_doc.asp?symbol=E/CN.3/2013/10
United Nations. 2018. Department of Economic and Social Affairs. Statistics Division. SDG Indicators Metadata Repository. https://unstats.un.org/sdgs/metadata/
United Nations Development Program. 2018. Human Development Indices and Indicators: 2018 Statistical Update. http://hdr.undp.org/sites/default/files/2018_human_development_statistical_update.pdf
World Bank. 2018. Poverty and Shared Prosperity Report. http://www.worldbank.org/en/publication/poverty-and-shared-prosperity
World Bank. n.d. Gender Data Navigator. http://datanavigator.ihsn.org/
World Economic Forum. 2018. Global Gender Gap Report. http://www3.weforum.org/docs/WEF_GGGR_2018.pdf
Annex 1: Study indicators
Table A1 lists the 104 indicators included in the assessments. SDG indicators are shown with the indicator number and tier classification assigned by the IAEG-SDG. Indicators from the Minimum Set of Gender Indicators or from UN Women’s list of supplemental indicators were assigned a number by the research team based on their relationship to the SDG targets, with the addition of an alphabetic character in the rightmost position. Tier classifications for the Minimum Set indicators were assigned by the IAEG-GS.
The original source of the indicators is coded as follows:
- AGI Additional gender indicators in SDGs
- MIN Indicators from the Minimum Set of Gender Indicators proposed by IAEG-GS
- SUP Supplemental indicators proposed by UN Women
- UNW SDG gender indicators identified by UN Women
The development data domains are coded as follows:
- ECON Economic
- EDUC Education
- ENVT Environmental sustainability
- HEAL Health
- HUMN Human security
- PART Public participation
Annex Table 1 Gender-relevant indicators
Indicator number | Source | Indicator | Tier | Domain
1.1.1 | UNW | Proportion of population below the international poverty line, by sex, age, employment status and geographical location (urban/rural) | Tier I | ECON
1.2.1 | UNW | Proportion of population living below the national poverty line, by sex and age | Tier I | ECON
1.2.2 | UNW | Proportion of men, women and children of all ages living in poverty in all its dimensions according to national definitions | Tier II | ECON
1.3.1 | UNW | Proportion of population covered by social protection floors/systems, by sex, distinguishing children, unemployed persons, older persons, persons with disabilities, pregnant women, newborns, work-injury victims and the poor and the vulnerable | Tier II | ECON
1.5.1 | AGI | Number of deaths, missing persons and directly affected persons attributed to disasters per 100,000 population | Tier II | ENVT
2.1.1 | AGI | Prevalence of undernourishment | Tier I | HEAL
2.1.X | SUP | Prevalence of anemia among women of reproductive age | | HEAL
2.2.1 | AGI | Prevalence of stunting (height for age <-2 standard deviation from the median of the World Health Organization (WHO) Child Growth Standards) among children under 5 years of age | Tier I | HEAL
2.2.2 | AGI | Prevalence of malnutrition (weight for height >+2 or <-2 standard deviation from the median of the WHO Child Growth Standards) among children under 5 years of age, by type (wasting and overweight) | Tier I | HEAL
2.2.X | MIN | Proportion of adults who are obese, by sex | GS Tier 1 | HEAL
2.2.Y | SUP | Share of women aged 15-49 whose BMI is less than 18.5 (underweight) | | HEAL
3.1.1 | UNW | Maternal mortality ratio | Tier II | HEAL
3.1.2 | UNW | Proportion of births attended by skilled health personnel | Tier I | HEAL
3.1.X | MIN | Antenatal care coverage | GS Tier 1 | HEAL
3.2.1 | AGI | Under-five mortality rate | Tier I | HEAL
3.2.2 | AGI | Neonatal mortality rate | Tier I | HEAL
3.3.1 | UNW | Number of new HIV infections per 1,000 uninfected population, by sex, age and key populations | Tier II | HEAL
3.3.2 | AGI | Tuberculosis incidence per 100,000 population | Tier I | HEAL
3.3.3 | AGI | Malaria incidence per 1,000 population | Tier I | HEAL
3.3.4 | AGI | Hepatitis B incidence per 100,000 population | Tier II | HEAL
3.3.5 | AGI | Number of people requiring interventions against neglected tropical diseases | Tier I | HEAL
3.3.X | MIN | Women’s share of population aged 15-49 living with HIV/AIDS | GS Tier 1 | HEAL
3.3.Y | MIN | Access to anti-retroviral drug, by sex | GS Tier 1 | HEAL
3.4.1 | AGI | Mortality rate attributed to cardiovascular disease, cancer, diabetes or chronic respiratory disease | Tier II | HEAL
3.4.2 | AGI | Suicide mortality rate | Tier II | HEAL
3.4.X | MIN | Life expectancy at age 60, by sex | GS Tier 1 | HEAL
3.4.Y | MIN | Adult mortality by cause and age groups | GS Tier 1 | HEAL
3.5.2 | AGI | Harmful use of alcohol defined according to the national context as alcohol per capita consumption (aged 15 years and older) within a calendar year in liters of pure alcohol | Tier I | HEAL
3.6.1 | AGI | Death rate due to road traffic injuries | Tier I | HEAL
3.7.1 | UNW | Proportion of women of reproductive age (aged 15-49 years) who have their need for family planning satisfied with modern methods | Tier I | HEAL
3.7.2 | UNW | Adolescent birth rate (aged 10-14 years; aged 15-19 years) per 1,000 women in that age group. (The 10-14 age group was not included in the indicator assessments.) | Tier II | HEAL
3.7.X | MIN | Contraceptive prevalence among women who are married or in a union, aged 15-49 | GS Tier 1 | HEAL
3.9.1 | AGI | Mortality rate attributed to household and ambient air pollution | Tier I | ENVT
3.9.2 | AGI | Mortality rate attributed to unsafe water, unsafe sanitation and lack of hygiene. | Tier II | ENVT
3.9.3 | AGI | Mortality rate attributed to unintentional poisoning | Tier II | HEAL
3.a.1 | AGI | Age-standardized prevalence of current tobacco use among persons aged 15 years and older | Tier I | HEAL
4.1.1 | AGI | Proportion of children and young people: (a) in grades 2/3; (b) at the end of primary; and (c) at the end of lower secondary achieving at least a minimum proficiency level in (i) reading and (ii) mathematics, by sex | Tier III (a)/ | EDUC
4.1.X1 | MIN | Adjusted net enrolment rate in primary education by sex | GS Tier 1 | EDUC
4.1.X2 | MIN | Gross enrolment ratio in secondary education, by sex | GS Tier 1 | EDUC
4.1.X3 | MIN | Gross enrolment ratio in tertiary education, by sex | GS Tier 1 | EDUC
4.1.X4 | SUP | Illiteracy rates, by sex | | EDUC
4.1.X5 | MIN | Adjusted net intake rate to the first grade of primary education, by sex | GS Tier 1 | EDUC
4.1.X6 | SUP | Proportion of women with six or less years of education | | EDUC
4.1.X7 | MIN | Primary education completion rate (proxy), by sex | GS Tier 1 | EDUC
4.1.X8 | MIN | Gross graduation ratio from lower secondary education, by sex | GS Tier 1 | EDUC
4.1.X9 | MIN | Effective transition rate from primary to secondary education (general programs), by sex | GS Tier 1 | EDUC
4.1.X10 | SUP | Proportion of women with less than a high school diploma | | EDUC
4.2.2 | UNW | Participation rate in organized learning (one year before the official primary entry age), by sex | Tier I | EDUC
4.3.1 | UNW | Participation rate of youth and adults in formal and non-formal education and training in the previous 12 months, by sex | Tier II | EDUC
4.3.X | SUP | Primary and secondary out of school rates, by sex | | EDUC
4.4.1 | AGI | Proportion of youth and adults with information and communications technology (ICT) skills, by type of skill | Tier II | EDUC
4.4.X | MIN | Share of female science, engineering, manufacturing and construction graduates at tertiary level | GS Tier 1 | EDUC
4.5.X | MIN | Educational attainment of the population aged 25 and older, by sex | GS Tier 1 | EDUC
4.6.X1 | MIN | Youth literacy rate of persons (15-24 years), by sex | GS Tier 1 | EDUC
4.6.1 | UNW | Proportion of population in a given age group achieving at least a fixed level of proficiency in functional (a) literacy and (b) numeracy skills, by sex | Tier II | EDUC
4.a.1 | UNW | Proportion of schools with access to: (f) single-sex basic sanitation facilities | Tier II | EDUC
4.c.1 | AGI | Proportion of teachers in: (a) pre-primary; (b) primary; (c) lower secondary; and (d) upper secondary education who have received at least the minimum organized teacher training (e.g. pedagogical training) pre-service or in-service required for teaching at the relevant level in a given country | Tier I | EDUC
4.c.X | MIN | Proportion of females among tertiary education teachers or professors | GS Tier 1 | EDUC
5.2.1 | UNW | Proportion of ever-partnered women and girls aged 15 years and older subjected to physical, sexual or psychological violence by a current or former intimate partner in the previous 12 months, by form of violence and by age | Tier II | HUMN
5.2.2 | MIN | Proportion of women (aged 15-49) subjected to sexual violence by persons other than an intimate partner, since age 15* | GS Tier 2 | HUMN
5.3.1 | UNW | Proportion of women aged 20-24 years who were married or in a union before age 15 and before age 18 | Tier II | HUMN
5.3.2 | UNW | Proportion of girls and women aged 15-49 years who have undergone female genital mutilation/cutting, by age | Tier II | HUMN
5.4.1 | UNW | Proportion of time spent on unpaid domestic and care work, by sex, age and location | Tier II | ECON
5.4.X | MIN | Average number of hours spent on paid and unpaid domestic work combined (total work burden), by sex | GS Tier 2 | ECON
5.5.1 | UNW | Proportion of seats held by women in (a) national parliaments and (b) local governments† | Tier I | PART
5.5.2 | UNW | Proportion of women in managerial positions | Tier I | PART
5.5.X1 | MIN | Women’s share of government ministerial positions | GS Tier 1 | PART
5.5.X2 | MIN | Percentage of female police officers | GS Tier 2 | PART
5.5.X3 | MIN | Percentage of female judges | GS Tier 2 | PART
5.6.1 | UNW | Proportion of women aged 15-49 years who make their own informed decisions regarding sexual relations, contraceptive use and reproductive health care | Tier II | HEAL
5.6.X | SUP | Proportion of women who have an independent/joint say in own health care | | HEAL
5.a.1 | UNW | (a) Proportion of total agricultural population with ownership or secure rights over agricultural land, by sex; and (b) share of women among owners or rights-bearers of agricultural land, by type of tenure | Tier II | ECON
5.b.1 | UNW | Proportion of individuals who own a mobile telephone, by sex | Tier I | ECON
6.1.1 | AGI | Proportion of population using safely managed drinking water services | Tier I | ENVT
6.2.1 | AGI | Proportion of population using safely managed sanitation services, including a hand-washing facility with soap and water | Tier I | ENVT
7.1.X | SUP | Proportion of women with access to clean cooking fuel | | ENVT
8.10.2 | AGI | Proportion of adults (15 years and older) with an account at a bank or other financial institution or with a mobile-money-service provider | Tier I | ECON
8.3.1 | UNW | Proportion of informal employment in non-agriculture employment, by sex | Tier II | ECON
8.3.X1 | MIN | Proportion of employed who are own-account workers, by sex | GS Tier 1 | ECON
8.3.X2 | MIN | Proportion of employed who are contributing family workers, by sex | GS Tier 1 | ECON
8.3.X3 | MIN | Proportion of employed who are employers, by sex | GS Tier 1 | ECON
8.3.X4 | MIN | Proportion of employed working part-time, by sex | GS Tier 2 | ECON
8.5.1 | UNW | Average hourly earnings of female and male employees, by occupation, age and persons with disabilities | Tier II | ECON
8.5.2 | UNW | Unemployment rate, by sex, age and persons with disabilities | Tier I | ECON
8.5.X | SUP | Labor force participation rate, by sex | | ECON
8.6.1 | AGI | Proportion of youth (aged 15-24 years) not in education, employment or training | Tier I | ECON
8.7.1 | UNW | Proportion and number of children aged 5-17 years engaged in child labor, by sex and age | Tier I | ECON
8.8.1 | UNW | Frequency rates of fatal and non-fatal occupational injuries, by sex and migrant status | Tier I | HEAL
9.2.2 | AGI | Manufacturing employment as a proportion of total employment | Tier I | ECON
9.2.X | MIN | Percentage distribution of employed population by sector, each sex (Sectors here refer to Agriculture; Industry; Services) | GS Tier 1 | ECON
10.1.1 | AGI | Growth rates of household expenditure or income per capita among the bottom 40 per cent of the population and the total population | Tier I | ECON
11.1.1 | AGI | Proportion of urban population living in slums, informal settlements or inadequate housing | Tier I | ENVT
11.2.1 | UNW | Proportion of population that has convenient access to public transport, by sex, age and persons with disabilities | Tier II | ENVT
16.1.1 | UNW | Number of victims of intentional homicide per 100,000 population, by sex and age | Tier I | HUMN
16.1.3 | AGI | Proportion of population subjected to physical, psychological or sexual violence in the previous 12 months | Tier II | HUMN
16.1.4 | AGI | Proportion of population that feel safe walking alone around the area they live | Tier II | HUMN
16.2.1 | AGI | Proportion of children aged 1-17 years who experienced any physical punishment and/or psychological aggression by caregivers in the past month | Tier II | HUMN
16.2.2 | UNW | Number of victims of human trafficking per 100,000 population, by sex, age and form of exploitation | Tier II | HUMN
16.2.3 | UNW | Proportion of young women and men aged 18-29 years who experienced sexual violence by age 18 | Tier II | HUMN
16.3.1 | AGI | Proportion of victims of violence in the previous 12 months who reported their victimization to competent authorities or other officially recognized conflict resolution mechanisms | Tier II | HUMN
16.3.2 | AGI | Unsentenced detainees as a proportion of overall prison population | Tier I | HUMN
16.5.1 | AGI | Proportion of persons who had at least one contact with a public official and who paid a bribe to a public official, or were asked for a bribe by those public officials, during the previous 12 months | Tier II | PART
16.9.1 | AGI | Proportion of children under 5 years of age whose births have been registered with a civil authority, by age | Tier I | PART
17.8.1 | AGI | Proportion of individuals using the Internet | Tier I | ECON
Annex 2: Indicator assessment methodology
Indicator selection
A list of 104 indicators of relevance for identifying the status and welfare of women was selected from the gender indicators proposed by the United Nations’ Inter-agency and Expert Group on Gender Statistics (IAEG-GS) or by UN Women, or included in the Sustainable Development Goals (SDGs). The combined set investigated in this report comprises 32 SDG indicators identified by UN Women as gender-relevant (UNW); 36 additional SDG indicators identified by Open Data Watch as being capable of sex-disaggregation or having other gender relevance (AGI); 27 indicators included in the IAEG-GS Minimum Set of Gender Indicators that were not included in the SDGs (MIN); and 9 supplemental indicators (SUP) suggested by UN Women in their publication Turning Promises Into Action (UN Women 2018).
The complete list of indicators selected for the research project is shown in Annex 1.
Tier classification
Indicators were classified by UN Women and by the Inter-agency and Expert Group on the SDGs (IAEG-SDG) in one of three tiers:
- Tier 1 indicators have an internationally established methodology and are regularly produced by at least 50 percent of countries;
- Tier 2 indicators have an established methodology but are not regularly produced by countries;
- Tier 3 indicators lack an established methodology.
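The three-tier rule listed above can be expressed as a small decision function. This is a minimal sketch of the rule as summarized in this annex, not the IAEG-SDG's official procedure; the function name and its parameters are hypothetical, and it assumes "not regularly produced" means a share of countries below 50 percent.

```python
def classify_tier(has_established_methodology, share_of_countries_producing):
    """Assign a tier from the two criteria described in the list above.

    `share_of_countries_producing` is a fraction between 0 and 1.
    Assumption: Tier 2 covers any share below the 50 percent threshold.
    """
    if not has_established_methodology:
        return 3
    if share_of_countries_producing >= 0.5:
        return 1
    return 2


# Hypothetical inputs, for illustration only.
print(classify_tier(True, 0.8))   # established methodology, widely produced
print(classify_tier(True, 0.2))   # established methodology, rarely produced
print(classify_tier(False, 0.0))  # no established methodology
```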
Only Tier 1 and Tier 2 indicators were included in the research set. The tier classification used was the one available in June of 2017. The IAEG-SDG has updated the tier classification at its semi-annual meetings, and the classification in the current list of indicators has been updated through 15 October 2018 (IAEG-SDG 2018). At least one indicator with sex-disaggregation has been moved from Tier 3 to Tier 2 since the research list was drawn up and is omitted here. The tier classification of each indicator is shown in Annex 1.
Domain typology
Mapping Gender Data Gaps (Data2X 2014) identified five statistical domains of development and gender interest: health, education, economic opportunities, political participation, and human security. In the current project, an additional domain encompassing environmental indicators was added. The indicators used in the current study were classified in six domains:
- Health (HEAL)
- Education (EDUC)
- Economic opportunities and access to resources (ECON)
- Public life and participation (PART)
- Human rights and security of women and children (HUMN)
- Environment and sustainability (ENVT)
The domain assignment of each indicator is shown in Annex 1.
Country selection
This project looked at gender data gaps for 15 countries in Sub-Saharan Africa. We focused our work on Sub-Saharan Africa because of the challenges the region faces in meeting the gender data demands of the SDGs. A diverse group of countries in the region, in terms of both income status and statistical capacity level, was carefully selected in order to better understand the realities that exist within different contexts. Additionally, the selected countries are strongly linked to Data2X’s gender data focal points program. These assessments can offer tailored possibilities for overcoming barriers, which our focal points for those countries can take up.
The fifteen selected countries have been designated by Data2X as focal countries: Uganda, Senegal, Rwanda, Kenya, Botswana, Lesotho, Malawi, Tanzania, Ethiopia, Nigeria, South Africa, Zambia, Zimbabwe, Cote d’Ivoire, and Ghana.
Country assessments
The project looked at international and national-level availability of the selected indicators from 2010 onwards for the 15 focal countries. Differences between the availability of indicators (and their disaggregations) in international and national databases occur because of adjustments made for international comparability; the use of different data collection instruments; or because countries choose not to publish indicators produced by international agencies.
A standard spreadsheet template was used for the 15 country assessments. The template listed the goal, target, indicator, and reference numbers, indicator name, tier classification, and the designated custodian or primary international agency for each indicator. The assessments were carried out by consultants and Open Data Watch staff under the supervision of the project manager and the director of research.
Assessing data availability for SDG indicators in international databases was a two-step process: the team first looked for data in the SDG Global Database maintained by the UN Statistics Division and then looked for data on the website(s) of the so-called custodian agencies or the World Bank’s World Development Indicators. For non-SDG indicators, assessors looked for data published by inter-governmental organizations that are primarily responsible for publishing relevant statistics for the topic of interest. For the purpose of this report, these are referred to as primary international agencies.
At the national level, the team looked for data located on national-level websites maintained by the national statistical offices and other ministries and agencies that are responsible for disseminating statistics. In some cases, Google searches were used to locate sources. Research reports or other non-official sources were included if they were based on data collected in the country and were well documented. Non-official reports located on websites maintained within the country were treated as national data. As a practical matter, precedence was given to high-level, readily accessible sites of official agencies.
Because of the ambiguous provenance of data stored on Open Data for Africa portals and DevInfo databases, these sources were not included as national or international data.
Assessors were responsible for providing the following information:
- International data source and metadata location: For SDG indicators, assessors looked for data located on the SDG Global Database and on websites of custodian agencies. For non-SDG indicators, assessors looked for data on websites of the primary international agency. Upon completion of the indicator search, assessors recorded the following information: source, navigation instructions for generating datasets (if applicable), indicator name, observations, available disaggregations, and the metadata URL.
- International data availability notes: Assessors provided brief notes, explaining the availability of data and the basis for scoring.
- International data availability score: Based on the availability of data on the SDG Global Database and the custodian agency (or primary international agency), assessors assigned a score of:
- A: specified indicator available with all suggested disaggregations;
- B: indicator available but lacks one or more disaggregations;
- C: related indicator with or without disaggregations;
- X: not available
- International data – sex-disaggregation score: After scoring data availability, assessors scored whether the indicator was disaggregated by sex, with the options of:
- A: sex-disaggregation available;
- F: female-specific indicator[6]
- X: not available
Thus, an indicator could be scored as lacking one or more disaggregations (B) but still receive a score of A or F for sex-disaggregation. Or an indicator with multiple disaggregations could be scored as BX if it was not published with sex-disaggregation.
- National data source and metadata location: Assessors were encouraged to catalogue the availability of publications, datasets, and databases across national statistical offices and ministry websites to help them easily access relevant publications. The project manager provided guidance and assistance to assessors in finding reports of Demographic and Health Surveys, Multiple Indicator Cluster Surveys, Living Standard Measurement Surveys, Labor Force Surveys, and censuses, along with statistical yearbooks and similar sources. Google searches were also used to find relevant publications or datasets. Assessors were encouraged to be aware of the structure of government URLs, as this can help refine a Google search (such as: “Uganda literacy rates education go.ug”).
Upon completion of the indicator search, assessors recorded the following information: link to dataset or publication, name of dataset, observations, disaggregation, and metadata. For PDF publications, assessors provided page numbers.
- National data availability notes: Assessors provided brief notes, explaining the availability of data and the basis for scoring.
- National data availability score: Scoring was based on the same rubric used for international data.
- National data – sex-disaggregation score: Scoring was based on the same rubric used for international data.
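The two-part rubric above can be read as a small decision procedure. The following is a minimal sketch in Python, not the assessment team's actual tooling: the `Finding` record and its field names are hypothetical, and returning X for sex-disaggregation when no data were found at all is an assumption rather than a rule stated in the methodology.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """What an assessor observed for one indicator in one database (hypothetical record)."""
    exact_match: bool             # the specified indicator itself is published
    related_match: bool           # only a related indicator is published
    missing_disaggregations: int  # number of suggested disaggregations not published
    sex_disaggregated: bool       # the data are published by sex
    female_specific: bool         # the indicator pertains only to women or girls


def availability_score(f: Finding) -> str:
    """Return A, B, C, or X following the availability rubric."""
    if f.exact_match:
        return "A" if f.missing_disaggregations == 0 else "B"
    if f.related_match:
        return "C"
    return "X"


def sex_disaggregation_score(f: Finding) -> str:
    """Return A, F, or X following the sex-disaggregation rubric."""
    # Assumption: if nothing was found, there is nothing to score.
    if not (f.exact_match or f.related_match):
        return "X"
    if f.female_specific:
        return "F"
    return "A" if f.sex_disaggregated else "X"


def combined_score(f: Finding) -> str:
    """Concatenate the two scores, e.g. 'BX'."""
    return availability_score(f) + sex_disaggregation_score(f)


# An indicator published with some disaggregations but not by sex.
no_sex = Finding(exact_match=True, related_match=False,
                 missing_disaggregations=1, sex_disaggregated=False,
                 female_specific=False)
print(combined_score(no_sex))  # BX
```

Keeping the two scores as separate functions mirrors the order in which assessors worked: availability first, then sex-disaggregation.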
Notes and observations on the indicator assessment process
- Because of lags in data collection and compilation, data for the most recent years are likely to be missing. Data that are currently in preparation but unpublished are not included, and the results do not fully reflect new data collection programs initiated after the announcement of the SDGs in 2015.
- The dates of publication for indicators were recorded, but scores are based on the overall availability of an indicator (even if there is only one observation) rather than on the number of observations.
- If an indicator has a score of “A” for international or national availability, then its sex-disaggregation score must be “A” or “F.”
- For an indicator availability score of “A,” the published indicator must match the specified indicator entirely and have all required disaggregations, including sex. There are instances where international and national sources match the SDG indicator and have sex-disaggregation but are still missing a component. An example is indicator 8.5.2: unemployment rate by sex, age, and persons with disabilities. Fourteen countries reported data for this indicator. Sex and age disaggregations were generally available; however, in 9 countries disability status was not available. Because a component of the indicator is missing, these countries received a score of B for indicator availability.
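The consistency rule in these notes (an availability score of A implies a sex-disaggregation score of A or F) lends itself to an automated check on recorded scores. The sketch below is illustrative only: the function name and the sample records are hypothetical and are not drawn from the study's results.

```python
VALID_AVAILABILITY = {"A", "B", "C", "X"}
VALID_SEX = {"A", "F", "X"}


def check_scores(availability: str, sex: str) -> list:
    """Return a list of problems found in one pair of scores (empty if none)."""
    problems = []
    if availability not in VALID_AVAILABILITY:
        problems.append(f"unknown availability score {availability!r}")
    if sex not in VALID_SEX:
        problems.append(f"unknown sex-disaggregation score {sex!r}")
    # An 'A' for availability means all required disaggregations, including sex.
    if availability == "A" and sex not in {"A", "F"}:
        problems.append("availability A requires a sex-disaggregation score of A or F")
    return problems


# Hypothetical records: (indicator, availability score, sex-disaggregation score).
records = [
    ("8.5.2", "B", "A"),  # sex and age available, disability status missing
    ("3.1.1", "A", "F"),  # female-specific indicator, fully available
    ("4.1.1", "A", "X"),  # inconsistent pair that should be flagged
]

flagged = [(ind, check_scores(a, s)) for ind, a, s in records if check_scores(a, s)]
for indicator, problems in flagged:
    print(indicator, problems)
```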
Challenges
- The major challenge the assessment team encountered was an update of the SDG Global Database that occurred during the assessment period. The update affected the availability of indicators and the number of observations for some indicators; furthermore, direct URLs to country-level datasets were no longer available. Fortunately, only three countries were affected (Uganda, Kenya, and Senegal), and the assessment team had to redo the SDG Global Database sources for these countries.
- Locating data at the national level is less structured than at the international level. During the research period, assessors accessed multiple national-level websites and publications to find relevant datasets. Furthermore, reports and statistical compilations derived from surveys, or entries in statistical abstracts and yearbooks, may show only one observation or an incomplete time series. Assessors therefore consulted multiple publications to record multiple observations.
Box 1 highlights challenges with national-level data availability at the country level.
Box 1. Challenges in finding data from national sources
Botswana |
For some annual publications, such as Vital Statistics Reports or Crime Statistics Reports, assessors faced the additional challenge of documenting each annual publication to record an observation. |
Côte d’Ivoire |
The assessor relied more heavily on Demographic and Health Survey (DHS) reports, which are available on the NSO website. Publications from ministry websites did not have datasets that fully met the selected indicators: either only similar indicators were available, or the data were referenced within text. For other indicators where DHS reports were not referenced, the assessor noticed discrepancies in the timeliness of data between the NSO website and other ministry websites. While data from 2010 onwards are available on the NSO website, the assessor noted that the ministry website has more recent data. |
Ethiopia |
On the Ministry of Education website, reports such as Education Statistics Annual Abstract 2015/16 contain education data for years prior to 2008. Although the reports are recent, these observations are for years prior to 2010 and therefore cannot be recorded under our methodology. |
Ghana |
The assessor relied solely on the Ghana Statistical Service website, as other ministry websites had little to no relevant data. The assessor encountered challenges in sorting through several different publications that have data on similar themes (such as health and education). Because similar data were located across different publications, it was harder to identify the best datasets. |
Kenya |
Kenyan publications sometimes included graphs with no data labels and without corresponding tables. In such cases, we were not able to use the data because no data values were available. In addition, several Statistical Abstracts covered only one specific year, so multiple editions had to be kept open to verify data points. |
Lesotho |
For many indicators, observations were spread across different reports on the Lesotho Bureau of Statistics website, which differs from the challenges we faced in assessing Botswana. An example is indicator 9.2.2 (manufacturing employment as a proportion of total employment), for which 2011 data are available in Lesotho’s DHS report and 2014/15 data are available in the Continuous Multi-Purpose Survey 3rd Quarter 2014/15 report. The assessor also observed that the latest DHS report available on the NSO or any ministry website is from 2011. The Lesotho Demographic and Health Survey 2014 report is referenced on the DHS website, with a link from which the publication can be downloaded. Although the report is available on the DHS website, the assessor was not able to cite it, as it is not available on the NSO or other ministry websites. |
Malawi |
Many reports available through the national statistical office of Malawi have only one observation available. These reports include the DHS, Welfare Monitoring Survey, and Integrated Household Survey reports. Assessors faced the additional challenge of going through prior publications in order to record multiple observations. |
Nigeria |
No assessor feedback |
Rwanda |
Many statistical yearbooks on the National Institute of Statistics of Rwanda website were inconsistent in their table headings and numbering. Depending on the dataset, the statistical yearbooks give data for a single year or omit years altogether. |
Senegal |
For some reports on the Agence Nationale de la Statistique et de la Démographie website, such as DHS and Economic and Social Situation of Senegal reports, assessors faced the additional challenge of documenting each annual publication to record an observation. |
South Africa |
Many reports available on the Statistics South Africa website have only one observation available. These reports include the Quarterly Labour Force Survey and the General Household Survey. Assessors faced the additional challenge of going through prior publications in order to record multiple observations. For many indicators under SDG 8, our assessors recorded over five sources. |
Tanzania |
The primary challenge our assessors faced was distinguishing between data for Tanzania (mainland) and data for the United Republic of Tanzania (Tanzania and Zanzibar). DHS reports available on the National Bureau of Statistics of Tanzania website have data for both Tanzania (mainland) and Zanzibar. However, Integrated Labour Force Survey Analytical Reports available on the NSO website cover only Tanzania (mainland). Data on the Ministry of Education website and the United Republic of Tanzania – Government Basic Statistics Portal are likewise available only for Tanzania (mainland). For the purpose of this research, we allowed the recording and scoring of data availability for Tanzania (mainland). |
Uganda |
For annual reports available on the Uganda Bureau of Statistics website, such as the DHS, Education Abstract, and Uganda National Abstract, assessors faced the additional challenge of documenting each annual publication to record multiple observations. Our assessors also observed that the DHS 2011 and 2016 reports are not entirely comparable. An example is indicator 3.2.2: neonatal mortality rate. The DHS 2011 report has sex-disaggregation available for the neonatal mortality rate; however, the 2016 report does not. |
Zambia |
Assessors encountered website downtime of ministry websites during the assessment period. Due to a lack of centralization of data, there were data on ministry websites that assessors once viewed but could no longer access due to the website downtime. |
Zimbabwe |
Towards the completion of Zimbabwe’s country-level data availability assessment, the Zimbabwe National Statistics Agency (ZimStat) website went down for a few days. When the website was functioning again, many reports, such as Labour Force Surveys and Demographic and Health Surveys, were no longer available. Additionally, ZimStat’s search function was no longer working. Initially, this posed some challenges in verifying the accuracy of the datasets; however, the assessor had downloaded all relevant publications prior to the website’s shutdown. |
Annex 3: Microdata assessment
Background
The first phase of the project assessed the availability of 104 candidate indicators, including SDG and non-SDG indicators in international and national databases from 2010 onwards across the 15 Data2X focal countries. In the second phase, the research team mapped the sources of the microdata used to construct 68 gender-relevant SDG indicators.
To understand the general landscape of microdata availability across the selected countries, the research team conducted a preliminary mapping exercise to examine the availability of surveys that may be relevant to the six domains. These included censuses, household income and expenditure surveys, labor force surveys, Demographic and Health Surveys, Multiple Indicator Cluster Surveys, and general household surveys. The primary sources used to find available microdata were the IHSN Data Catalogue, the World Bank Microdata Library, and the focal countries’ respective NADA portals.
The team then created a template as a basis for preparing indicator summary pages for the 68 gender-relevant SDG indicators. Each summary page contains:
- SDG indicator number and indicator description
- Relevant notes from international metadata
- A list of available data from national sources (along with their scores from phase I)
- A list of the principal microdata sources used to construct the indicator for each country with available data
- Conclusions and recommendations
Upon completion of the indicator summary pages, the identified microdata sources were matched to the surveys located through the microdata mapping exercise to record indicators that can be constructed from existing surveys. The underlying data for indicators that were built from administrative sources were difficult to find but the most likely administrative source was noted in the summaries.
An example of an indicator summary page is shown in Annex 4.
Challenges and observations during the microdata assessment
- In examining surveys for potential disaggregation, it was sometimes difficult to know what data had been collected. Some survey modules did not specify whether the questions were directed to all members of a household or to a head of household or other respondent. Without this information, it was difficult to determine whether the data could be constructed only for the household population or for the head of household or whether the survey response might have been influenced by the sex of the respondent.
- Some of the survey questionnaires were in an image or other format that did not allow keyword searches. Searching these questionnaires manually took the team much longer.
- The UN metadata for some indicators did not provide clear guidance, or provided conflicting information, on how the indicator should be calculated. One example is Indicator 5.6.1: Proportion of women aged 15-49 years who make their own informed decisions regarding sexual relations, contraceptive use and reproductive health care. The indicator is meant to monitor whether women can make their own informed choices, but the corresponding survey questions ask only whether women can make their own choices, not whether those choices are informed by the proper information. As a result, it is unclear whether the calculation is supposed to capture informed choice, which could cause confusion.
- Indicators that are constructed from administrative datasets did not have the proper source and metadata available for assessors to identify and view the underlying data. When administrative datasets are used to construct the indicators, the ministry responsible for the data is often referenced but without any mention of the name or location of the dataset. Attempts were made to find these data on the ministry websites that were noted but they are either not available or impossible to find without more information about the underlying datasets.
Annex 4: Representative Indicator Summary Sheet
SDG indicator 3.1.1
SDG indicator description: Maternal Mortality Ratio
Relevant notes from the international metadata:
Production method for international databases: Calculated by dividing recorded (or estimated) maternal deaths by total recorded (or estimated) live births in the same period and multiplying by 100 000. Measurement requires information on pregnancy status, timing of death (during pregnancy, childbirth, or within 42 days of termination of pregnancy), and cause of death. The maternal mortality ratio can be calculated directly from data collected through vital registration systems, household surveys or other sources.
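The production method described above reduces to a single ratio. As a minimal sketch (the function name and the example counts are hypothetical, chosen only for illustration):

```python
def maternal_mortality_ratio(maternal_deaths: float, live_births: float) -> float:
    """Maternal deaths per 100,000 live births in the same period."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    if maternal_deaths < 0:
        raise ValueError("maternal_deaths cannot be negative")
    return maternal_deaths * 100_000 / live_births


# Hypothetical example: 45 maternal deaths and 15,000 live births in one period.
print(maternal_mortality_ratio(45, 15_000))  # 300.0
```

The inputs may be recorded or estimated counts, as the metadata notes; the arithmetic is the same in either case.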
Other important issues to note:
- To measure maternal mortality in household surveys, respondents are asked whether any of their sisters died during, soon after, or as a result of pregnancy.
- Maternal mortality is a difficult indicator to measure because of the large sample sizes required to calculate an accurate estimate. This is evidenced by the fact that the MMR is expressed per 100,000 live births, which demonstrates that it is a relatively rare event. As a result, maternal mortality estimates are subject to large sampling errors.
- There are often data quality problems, particularly related to the underreporting and misclassification of maternal deaths. Therefore, data are often adjusted in order to take these data quality issues into account. Some countries undertake these adjustments or corrections as part of specialized/confidential enquiries or administrative efforts embedded within maternal mortality monitoring programs.
Source: https://unstats.un.org/sdgs/metadata/files/Metadata-03-01-01.pdf
Differences in the availability of the indicator at the international and national levels: The indicator is available at both the national and international levels for all countries.
Data available from national sources:
All available data were scored A/F.
- All of the countries had data available.
- The countries using the DHS for their maternal mortality ratio measurements, however, did not adhere strictly to the SDG methodology: they asked about deaths within 60 days (2 months), not 42 days, after the termination of pregnancy. This is probably because it is easier for respondents to remember whether someone died within 2 months of a pregnancy than within 6 weeks, but the discrepancy could still cause issues.
- For data that did not come from the DHS, the underlying source was difficult to identify. This applies to Botswana, Ghana, Lesotho, South Africa, and Zimbabwe.
Primary sources used to construct the indicator
AF Countries
Country: Botswana Maternal Mortality Ratio 2016
Can SDG indicator be constructed? The country produces the data, but the underlying source is unclear; it is listed only as Botswana data from the Ministry of Health and Wellness.
Country: Côte d’Ivoire DHS 2012
Can SDG indicator be constructed? Yes, indicator can be constructed (section 13, p.352).
URL: http://catalog.ihsn.org/index.php/catalog/6036
Country: Ethiopia DHS 2016
Can SDG indicator be constructed? Yes, indicator can be constructed (section 12 of women’s questionnaire).
URL: http://catalog.ihsn.org/index.php/catalog/7199
Country: Ghana Population and Housing Census 2010
Can SDG indicator be constructed? The census uses the prescribed SDG methodology, asking whether the death occurred within 6 weeks after pregnancy (page 12 of the document at the URL).
URL: http://catalog.ihsn.org/index.php/catalog/3780
Country: Kenya Demographic and Health Survey, various editions: 1998, 2003, 2008/09, 2014
Can SDG indicator be constructed? Yes, indicator can be constructed (section 11 of questionnaire).
URL: http://catalog.ihsn.org/index.php/catalog/6510
Country: Lesotho 2011 Demographic Survey
Can SDG indicator be constructed? For the assessment, the Lesotho Demographic Survey 2011 and Census 2016 are cited. The Lesotho Demographic Survey 2011 is not a DHS survey, and no questionnaire is available for the 2016 census. Therefore, to determine whether the indicator can be constructed, we use the Lesotho DHS 2014 as a source. Please note that the final report is not accessible on the NSO website. Based on the DHS 2014 questionnaire, it is possible to construct the SDG indicator (see section 11 of women’s questionnaire).
URL: http://catalog.ihsn.org/index.php/catalog/6666
Country: Malawi DHS 2015-16
Can SDG indicator be constructed? Yes, it is possible to construct the SDG indicator (see section 12).
URL: http://catalog.ihsn.org/index.php/catalog/7013
Country: Nigeria DHS 2013
Can SDG indicator be constructed? Yes, it is possible to construct the SDG indicator (see section 10).
URL: http://catalog.ihsn.org/index.php/catalog/4749
Country: Rwanda 2014-15 DHS
Can SDG indicator be constructed? Questions on maternal mortality are not available in the annexes of the DHS report, but since there are detailed datasets on MMR, it is assumed the SDG indicator can be constructed.
URL: http://catalog.ihsn.org/index.php/catalog/7117
Country: Senegal 2010-11 DHS MICS
Can SDG indicator be constructed? Questions on maternal mortality are not available in the annexes of the DHS report, but since there are detailed datasets on MMR, it is assumed the SDG indicator can be constructed.
URL: http://catalog.ihsn.org/index.php/catalog/2461
Country: South Africa MDG Country Report 2015 (Microdata not available)
Can SDG indicator be constructed? This publication is a compilation of statistics, not a survey report, so microdata and questionnaires are not available. However, according to page 10, the sources for MMR are the DHS 1998 and vital registration. In this case, it is assumed that vital registration records are used to calculate this MDG/SDG indicator.
URL: http://www.statssa.gov.za/MDG/MDG_Country%20Report_Final30Sep2015.pdf
Country: Uganda DHS 2016
Can SDG indicator be constructed? Yes, SDG indicator can be constructed (see section MM of women’s questionnaire).
URL: http://catalog.ihsn.org/index.php/catalog/7389
Country: Zambia DHS 2013-2014
Can SDG indicator be constructed? Yes, SDG indicator can be constructed (see section 9 of questionnaire).
URL: http://catalog.ihsn.org/index.php/catalog/6251
Country: Zimbabwe Census 2012
Can SDG indicator be constructed? The questionnaire (question number 37) asks for deaths that happened a month after pregnancy and not the 6 weeks that is recommended in the methodology.
URL: http://catalog.ihsn.org/index.php/catalog/2986
Conclusions and recommendations
Are the methods used by the countries with data sufficient to produce the SDG indicator?
Yes, countries using the DHS method can produce the SDG indicator. However, because the indicator requires large sample sizes, the estimates come with confidence intervals that can be broad, which makes it more difficult to interpret the results and the trajectory of the maternal mortality ratio over time. Sources with larger sample sizes, such as the census, could be used to narrow the confidence intervals.
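To illustrate why larger samples narrow the interval, the sketch below applies a simple normal approximation to a Poisson count of maternal deaths. This is a deliberate simplification and an assumption of this example: it ignores survey design effects and is not the estimation procedure used in DHS reports. The function names and the counts are hypothetical.

```python
import math


def mmr_confidence_interval(maternal_deaths: float, live_births: float, z: float = 1.96):
    """Approximate MMR and its confidence interval, treating deaths as a Poisson count."""
    if maternal_deaths <= 0 or live_births <= 0:
        raise ValueError("maternal_deaths and live_births must be positive")
    ratio = maternal_deaths * 100_000 / live_births
    # Standard error of a Poisson count is sqrt(count); scale it like the ratio.
    half_width = z * math.sqrt(maternal_deaths) * 100_000 / live_births
    return ratio, max(0.0, ratio - half_width), ratio + half_width


def relative_half_width(maternal_deaths: float, z: float = 1.96) -> float:
    """Half-width of the interval as a share of the estimate: z / sqrt(deaths)."""
    return z / math.sqrt(maternal_deaths)


# Two hypothetical sources with the same underlying ratio but different sizes.
for deaths, births in [(25, 10_000), (400, 160_000)]:
    ratio, low, high = mmr_confidence_interval(deaths, births)
    print(f"deaths={deaths}: MMR={ratio:.0f}, interval=({low:.1f}, {high:.1f}), "
          f"relative half-width={relative_half_width(deaths):.3f}")
```

Under this approximation the relative width depends only on the number of deaths observed, so a source that captures 16 times as many deaths has an interval about one quarter as wide.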
What are the common characteristics of the microdata sources used by most countries? What are the major exceptions, if any?
Most of the data reviewed came from the DHS, and most of the data sources had the same structure and contained the same information. However, the DHS sources use a different time interval from the SDG definition when asking whether a death was related to pregnancy: the DHS asks about deaths within 2 months after pregnancy, whereas the indicator calls for 42 days. The Zimbabwean population census, in turn, asks about deaths one month after a pregnancy. These discrepancies may not have a large effect on the results, because the difference in recall between these time periods may not be significant, but it is worth noting that countries ask about different time periods and most of them do not follow the 6-week protocol.
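The reference periods mentioned in this summary sheet can be compared against the 42-day definition directly. The sketch below is illustrative: the dictionary and function names are hypothetical, and converting "one month" to 30 days is an assumption.

```python
SDG_WINDOW_DAYS = 42  # deaths within 42 days of termination of pregnancy

# Reference periods as described in this summary sheet.
reference_windows_days = {
    "DHS": 60,                                       # 2 months, recorded as 60 days
    "Ghana Population and Housing Census 2010": 42,  # 6 weeks
    "Zimbabwe Census 2012": 30,                      # one month; 30 days is an assumption
}


def window_gap(days: int) -> int:
    """Days by which a source's reference period differs from the SDG definition."""
    return days - SDG_WINDOW_DAYS


gaps = {source: window_gap(days) for source, days in reference_windows_days.items()}

for source, gap in gaps.items():
    if gap == 0:
        status = "matches the SDG definition"
    elif gap > 0:
        status = f"{gap} days longer than the SDG definition"
    else:
        status = f"{-gap} days shorter than the SDG definition"
    print(f"{source}: {status}")
```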
How would you recommend countries currently without data produce the indicator?
All countries surveyed have maternal mortality ratio data. If more frequent data are needed, the relevant questions could be added to censuses, or the indicator could be calculated from an improved CRVS system. Because the census has a large sample size, this could narrow the confidence intervals around the maternal mortality ratio and provide more precise measurements.
Are there other possible sources that could be used to create this indicator?
Aside from household surveys, civil registration and vital statistics (CRVS) may be used to create this indicator, as in the case of South Africa. However, this depends on the strength of the CRVS systems.
Footnotes:
- The World Bank is currently working to update the GDN by identifying microdata sources for producing gender-relevant SDG questions from surveys in eight countries.
- Indicators that did not exactly match the definition of the indicator given in the SDGs or in the original source list were classified as “related” indicators with their disaggregations recorded.
- The full set of indicators is listed in Annex 1, and Table A1 summarizes their original source. The classification of each indicator is shown in Annex 1.
- In the education domain, where indicators are typically reported by school stage and disaggregated by sex or other characteristics, some indicators also specify components that are typically recorded as separate indicators with their own disaggregations. For example, indicator 4.a.1 specifies seven separate component indicators, including the number of schools with single sex sanitation facilities.
- The female-specific designation was used for indicators that pertain only to girls, women, or a single sex.