The Internet and World Wide Web have become the principal gateway to official statistics distributed by national statistical offices (NSOs) and other government agencies. Almost every NSO maintains a website or data portal (some more than one) that offers access to official statistics. Websites are an efficient means of dissemination because they can distribute large volumes of data at very low cost. However, their effectiveness depends on whether data users are aware of the websites and their contents and, if so, whether those users are able to find the information they are seeking.
Web analytics are a means of measuring, collecting, and analyzing website data and the behavior of users on a website. They are widely used by commercial and non-profit organizations to measure the performance of websites and to improve the experience of users online. There are a variety of free and proprietary software systems that can be used to generate and monitor web traffic data. Google Analytics is a widely adopted analytics tool because it is free, easy to implement, and it effectively reports actionable data that allows organizations to optimize their web content.
This report was undertaken to help NSOs and their partners better understand the benefits of web analytics tools and assist them in implementing web analytics on their websites or data portals. These tools are an important part of the data revolution and can contribute to increasing the availability and accessibility of data. NSO web managers and partners should understand that collecting and analyzing web traffic data with an analytics tool is a critical first step in establishing a complete dissemination strategy to better serve a website’s users.
Open Data Watch, in partnership with PARIS21, invited seven NSOs in low- or middle-income countries to participate in a study analyzing web traffic on their principal websites or data portals using Google Analytics. They were selected to include at least one NSO from major geographic regions: South America (1), Sub-Saharan Africa (3), Eastern Europe (1), Southeast Asia (2). The selected countries differ in size and in their level of economic development and are not necessarily representative of their regions or of low- and middle-income countries in general. However, the results reveal patterns of website use that provide a useful reference for other NSOs and their website managers. They also serve as an example for analyzing web traffic data that can be applied to other websites.
According to a survey conducted jointly by AidData and Open Data Watch, NSO managers identified web analytics as the most important way of monitoring their data dissemination programs. Importantly, this study revealed that website users are seeking and do find data through the websites and data portals we analyzed, confirming that NSO websites and data portals are not graveyards but serve a vital role in making data available to the public and other stakeholders. However, the study also found that most participating websites are not configured to collect or make optimal use of the available web analytics data. A set of specific recommendations to improve the structure and configuration of these websites, along with detailed instructions for their implementation, is included with this report. A customized dashboard providing a selection of the most relevant web traffic data has also been created to serve as a resource for NSOs to monitor the use of their websites and data portals. These are available as annexes to this report.
- This study provides clear evidence that NSO websites and data portals are not data graveyards; they are widely used to access data and serve a vital function in the effort to disseminate data to the public.
- While NSO websites receive a higher volume of web traffic than dedicated data portals, they offer a wider variety of information and services that attract visitors, not all of whom are seeking data.
- Although dedicated data portals receive much lower levels of web traffic than NSO websites, visitors to those sites make more in-depth use of them.
- Search engines drive the most traffic to NSO sites while direct channels drive most traffic to data portals. NSOs should implement a search engine optimization plan and consider strategies to raise awareness of data portals where necessary.
- Web traffic data can fluctuate because of a variety of external factors. These factors should be considered by NSO web managers when reviewing analytics and updating content to make data more accessible.
- Many countries showed decreases in web traffic during the months of December and January, while other variations were unique to each country. Increases likely correspond to the release of data or other key moments on the NSO’s calendar.
- Most users arrive at NSO websites through search engines, while most arrive at data portals through direct channels. However, a large share of traffic to data portals that are closely tied to NSO websites comes from referrals, as users arrive on the data portal through links available throughout the NSO website.
- Domestic users are the source of the greatest number of site visits, which explains why websites from countries with larger populations have higher traffic. Some sites receive significant numbers of foreign visitors, usually from countries with social and economic ties to the country.
- Because of limitations of the structure and naming conventions of the websites and data portals included in this study, it was not possible to estimate the proportion of site visits that sought data or resulted in a data download.
This report was made possible through generous financial support from the William and Flora Hewlett Foundation and is a collaboration between Open Data Watch, AidData at William & Mary, and PARIS21.
Part of a broader project to increase the impact of official statistics, this report focuses on measuring the use of data by analyzing data portals to gather use metrics. It is a companion to Counting on Statistics: What can data producers and donors do differently to increase use? which provides evidence of the perspectives of national statistical offices and government ministries in low- and middle-income countries on the use of official statistics.
The principal authors of this report were Eric Swanson and Amelia Pittman of Open Data Watch with advice and assistance from Shaida Badiee, Tawheeda Wahabzada, Deirdre Appel, Caleb Rudow, and Jamison Crowell. William Hensley assisted in the final editing. Francois Fonteneau and Rajiv Ranjan of PARIS21 collaborated in the design of the project and provided invaluable feedback throughout. Geoffrey Greenwell, formerly at PARIS21, provided useful advice in the early stages. Ben Crawford, Jessica Cooper, and Autumn Rose of Forum One provided technical support on the analysis of web analytics and the design of the dashboard template.
Twenty years ago, most developing countries relied on CD-ROMs, DVDs, or paper publications to make their data available to users. Even large statistical offices in high-income countries were likely to use disks or even magnetic tape as their medium for large datasets. Today, dissemination of official statistics takes place on the website of a statistical agency or a third-party website that repackages data obtained through an application programming interface (API). Recognizing this important advance in the data revolution, the 2017 Cape Town Global Action Plan for Sustainable Development calls for the development of technological infrastructure for better data dissemination and for expanded use of online methods for the dissemination of SDG statistics.
In the most recent round of assessments for the Open Data Inventory (ODIN), NSOs in 178 out of 180 countries, representing 99 percent of the world’s population, had a functioning website to distribute official statistics. A survey conducted jointly by Open Data Watch and AidData showed that a majority of staff in government ministries (who are important users of the statistics produced by NSOs) said they preferred to obtain statistical information from online sources. In the same survey, 72 percent of NSO staff and leaders said they are currently using web analytics to monitor use of their websites. Among those not currently monitoring data use, 85 percent identified web analytics as the tool they would prefer to use.
In 2016, PARIS21 published a report that highlighted the proliferation of data portals for disseminating official statistics. On average, they found more than two such sites per country, and in African countries, where donors and international organizations have encouraged their development, there were an average of 3.4 websites. The report raised concerns about the duplication of effort, the confusion among users caused by multiple sites, and “the high costs for demonstrably low usage of these portals.” The study was able to obtain evidence on the use of one country’s data portal. The results were not reassuring. They found that 89 percent of 54,000 monthly visitors to an African NSO website stayed on the home page. Of those that navigated more deeply into the website, 6.5 percent went to a page with public job offers; 4.4 percent downloaded data publications in PDF format; and fewer than .005 percent – approximately 2.5 people each month – went on to a dedicated data portal to download data.
This study was designed to extend the work of PARIS21 by capturing more information about the volumes and patterns of web traffic on NSO websites and data portals. In addition, it provided an opportunity to demonstrate the tools available to NSOs for monitoring their web traffic. Notwithstanding interest expressed by NSOs in web analytics, few of the websites included in this study were set up with highly optimized versions of Google Analytics. As a contribution of the study, we provided recommendations to participating NSOs for adjustments to their implementation of Google Analytics to improve their ability to monitor and analyze web traffic. The recommendations are summarized in Annex 1 of this report. Additionally, we have provided participating NSOs with a custom dashboard that displays key indicators from Google Analytics. For other NSOs that might wish to improve their monitoring efforts, the Diagnostic Toolkit included with this report provides access to a dashboard template that can be copied and connected to web traffic from any Google Analytics account.
Study Participation and Data Collection
Web analytics refers to the collection of data about users and their behavior on a website. Data may be collected in log files maintained by the server that hosts the website, but more commonly, the record of visits to the site is created by “tags,” script files that are embedded in the site and activated when a user visits a web page. The data generated can include the location of the visitor; whether the same visitor (using the same computer and web browser) has visited the site before; how long the visitor remained on a specific web page and where they went next; what search terms they employed while on the site; and other details of the way they interacted with the website content. The quality of the data produced depends on how well designed the website is, the placement of the tags, and how the analytics software, in this case Google Analytics, has been set up. While there are many vendors that provide web analytics services, Google Analytics, in addition to being a free service, is the most well-known and widely adopted analytics software available.
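As a rough illustration of the log-file approach mentioned above, the sketch below counts page views per path from server log lines. It assumes the server writes the widely used Common Log Format; the sample lines, IP addresses, and paths are invented for illustration and do not come from any site in the study. Real logs would also need bot filtering before the counts could be trusted.

```python
import re
from collections import Counter

# Matches the request and status fields of a Common Log Format line, e.g.
# 203.0.113.5 - - [10/Oct/2017:13:55:36 +0000] "GET /statistics/health/ HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

def count_page_views(lines):
    """Count successful GET requests per requested path."""
    views = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and match["method"] == "GET" and match["status"] == "200":
            views[match["path"]] += 1
    return views

# Hypothetical sample log lines (assumption, for illustration only).
sample_log = [
    '203.0.113.5 - - [10/Oct/2017:13:55:36 +0000] "GET /statistics/health/ HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2017:13:56:01 +0000] "GET /statistics/health/ HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2017:13:57:12 +0000] "GET /missing-page HTTP/1.1" 404 512',
]

page_views = count_page_views(sample_log)
print(page_views.most_common())
```

Tag-based tools such as Google Analytics collect richer information than this (sessions, returning visitors, time on page), which is one reason they are more commonly used than raw server logs.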
The countries invited to participate in this study were selected from among low- and middle-income countries. The data collected from Google Analytics for the study cover the period October 2017 through September 2018. Because of the need to obtain historical data, only NSOs that had previously installed Google Analytics on their statistical websites were included. Countries in Africa were given preference, but countries from Latin America, Europe, and Asia were also included. Eight countries were invited to participate in this study, selected based on past relationships each one had with PARIS21 and Open Data Watch. Seven agreed to participate in the study and to grant access to the Google Analytics account installed on their websites so that Open Data Watch could directly access web traffic data. In addition to the seven NSO websites, three countries provided access to data portals. All participating countries agreed to allow their anonymized data to appear in this summary report, which provides cross-country analysis.
The data available on the seven NSO websites and on two of the data portals are primarily statistical indicators compiled from censuses, surveys, and administrative data. One data portal is a specialized archive of microdata containing documentation of surveys and census records. Although the sample of countries and websites is not representative of all countries or even of countries in their regions, some of the observed patterns of use are likely to be commonly found.
Forum One, which served as advisor on web analytics, conducted further analysis prior to the beginning of the study to identify potential issues of website design that might impede analysis. The review of the websites showed that most were not set up to filter traffic generated by bots (automated programs that search for and download information from the internet) and most had not enabled site search tracking, which reports search words used by visitors. To improve data collection, participants were advised on how to filter bots and set up site search tracking. This outreach resulted in a total of seven of the websites (from five countries) with enabled search tracking and bot filtering. However, three websites experienced difficulties that will require further efforts to implement changes. Guidance for resolving these issues is provided in the instruction package included as an annex of this report. Unfortunately, none of the websites had implemented tracking of data downloads. One NSO required extensive assistance in resolving issues related to granting access to Google Analytics. To avoid overburdening NSO staff, no further changes to their websites were suggested.
In some cases, the data appearing in this report differs from raw Google Analytics reports because extraordinary traffic from the Reuters network has been excluded. The initial analysis of the geographic origins of users showed that the United Kingdom was one of the largest sources of web traffic on some websites, even surpassing traffic originating in the country itself. However, the bounce rate for this traffic was high, in some cases over 90 percent, suggesting that whoever accessed the site from the Reuters network left without interacting with the page or navigating to another page. The unusual activity took place from July to November 2017 and originated from London. Further analysis revealed that it came from the Reuters network. Although it is possible to identify the national origin of most visitors, IP addresses are not reported by Google Analytics, so individual visitors cannot be identified. However, when traffic originates from a network that is clearly labelled, as in the case of Reuters, it is possible to create filters that trap the traffic from that source. Why Reuters generated so many web visits is not known, and the pattern was not familiar to the web analytics experts at Forum One. To keep the data analysis from being skewed, Reuters web traffic was excluded.
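The effect of excluding a labelled network can be sketched as follows. This is a minimal illustration under stated assumptions, not the filter configuration used in the study: the `Session` record, its field names, and the sample data are hypothetical, and a bounce is approximated here as a session with a single page view.

```python
from dataclasses import dataclass

@dataclass
class Session:
    network: str      # network label reported for the session (hypothetical field)
    page_views: int   # number of pages viewed during the session

def bounce_rate(sessions):
    """Percent of sessions that viewed no more than one page."""
    if not sessions:
        return 0.0
    bounces = sum(1 for s in sessions if s.page_views <= 1)
    return 100.0 * bounces / len(sessions)

def exclude_network(sessions, label):
    """Drop sessions whose network label contains the given text (case-insensitive)."""
    return [s for s in sessions if label.lower() not in s.network.lower()]

# Invented sample data: three single-page visits from one labelled network
# mixed in with four ordinary sessions.
sessions = [
    Session("reuters network", 1),
    Session("reuters network", 1),
    Session("reuters network", 1),
    Session("national telecom", 1),
    Session("national telecom", 4),
    Session("university network", 6),
    Session("mobile carrier", 2),
]

raw_rate = bounce_rate(sessions)
filtered = exclude_network(sessions, "reuters")
filtered_rate = bounce_rate(filtered)
print(f"raw: {raw_rate:.1f}%  filtered: {filtered_rate:.1f}%")
```

In this toy data the unfiltered bounce rate is noticeably higher than the filtered one, which mirrors why leaving such traffic in place would skew the indicators reported below.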
Key Performance Indicators
Overview: Measuring web analytics sounds complicated – but useful and actionable data can be easily understood by looking at a few basic metrics.
Basic indicators of website traffic, including the number of users visiting the site, the number of sessions that the users have initiated, the number of page views occurring through those sessions, and the bounce rate, provide a useful overview of site traffic. This section analyzes these indicators, shown in Table 1, for each national statistical office (NSO) website and data portal included in the study.
The number of users shows how many uniquely identified individuals have visited the site during the studied period, although users accessing the site with multiple devices may be counted more than once. Considerable variation exists between the sites. Websites that belong to countries with larger populations tend to receive a larger number of visitors. The seven NSO sites received an average of 1.1 million users with a range of 4.2 million at the highest and 91,000 at the lowest. NSO sites receive much higher web traffic activity than data portals. The three data portals received an average of 39,265 users with a range of 59,000 at the highest and 3,900 at the lowest. An NSO site serves many functions in addition to providing data, while the dedicated data portals serve the sole function of providing access to data, which limits the type of users that would seek to access the site.
The number of sessions shows how many times users loaded the website, and the number of page views shows how many times the pages throughout the site were accessed. The seven NSO sites received an average of 2.0 million sessions with a range of 6.5 million at the highest and 177,000 at the lowest. They received an average of 5.7 million page views with a range of 15.9 million at the highest and 544,000 at the lowest. The three data portals received an average of 71,000 sessions with a range of 122,000 at the highest and 7,000 at the lowest. They received an average of 527,000 page views with a range of 1.0 million at the highest and 30,000 at the lowest. While the number of users provides an understanding of how many people accessed the site, the number of sessions and page views represents the frequency and depth at which the website was accessed.
The bounce rate shows the proportion of users that arrive on a website page and immediately leave without going farther into the site. The bounce rate may capture several different scenarios: users may arrive on the website by accident; they may be checking for an update; or they may be interested only in the information on that page. The seven NSO sites experienced an average bounce rate of 47.7 percent with a range of 56.0 percent at the highest and 39.7 percent at the lowest. The three data portals experienced an average bounce rate of 34.1 percent with a range of 45.5 percent at the highest and 25.2 percent at the lowest.
Table 1: Number of users, sessions, page views, and the bounce rate
|Website||Users||Sessions||Page views||Bounce Rate (%)|
|Country 1 – NSO Site||4,182,770||6,464,157||15,915,032||49.0|
|Country 1 – Data Portal||58,507||84,131||531,738||31.6|
|Country 2 – NSO Site||925,329||1,564,049||4,882,915||48.0|
|Country 3 – NSO Site||231,897||545,007||1,820,504||39.7|
|Country 3 – Data Portal||55,387||121,837||1,017,674||25.2|
|Country 4 – NSO Site||320,576||524,435||1,355,585||47.1|
|Country 5 – NSO Site||2,031,370||4,183,264||14,809,298||46.8|
|Country 6 – NSO Site||104,450||177,102||613,766||47.6|
|Country 6 – Data Portal||3,901||6,989||30,379||45.5|
|Country 7 – NSO Site||91,439||184,933||543,570||56.0|
|Average – NSO Sites||1,126,833||1,948,992||5,705,810||47.7|
|Average – Data Portals||39,265||70,986||526,597||34.1|
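The group averages in Table 1 are simple unweighted means across sites, which can be checked with a short sketch such as the one below. The figures are copied from Table 1; the helper names are illustrative, and the pages-per-session ratio at the end is a derived figure that the table does not report directly.

```python
from statistics import mean

# (users, sessions, page views, bounce rate %) copied from Table 1.
nso_sites = {
    "Country 1": (4_182_770, 6_464_157, 15_915_032, 49.0),
    "Country 2": (925_329, 1_564_049, 4_882_915, 48.0),
    "Country 3": (231_897, 545_007, 1_820_504, 39.7),
    "Country 4": (320_576, 524_435, 1_355_585, 47.1),
    "Country 5": (2_031_370, 4_183_264, 14_809_298, 46.8),
    "Country 6": (104_450, 177_102, 613_766, 47.6),
    "Country 7": (91_439, 184_933, 543_570, 56.0),
}
data_portals = {
    "Country 1": (58_507, 84_131, 531_738, 31.6),
    "Country 3": (55_387, 121_837, 1_017_674, 25.2),
    "Country 6": (3_901, 6_989, 30_379, 45.5),
}

def column_average(sites, index):
    """Unweighted mean of one column across all sites in a group."""
    return mean(row[index] for row in sites.values())

def pages_per_session(sites):
    """Derived metric (not in Table 1): total page views divided by total sessions."""
    total_sessions = sum(row[1] for row in sites.values())
    total_page_views = sum(row[2] for row in sites.values())
    return total_page_views / total_sessions

nso_avg_users = round(column_average(nso_sites, 0))
nso_avg_bounce = round(column_average(nso_sites, 3), 1)
portal_avg_users = round(column_average(data_portals, 0))
portal_avg_bounce = round(column_average(data_portals, 3), 1)

print(nso_avg_users, nso_avg_bounce, portal_avg_users, portal_avg_bounce)
print(round(pages_per_session(nso_sites), 1), round(pages_per_session(data_portals), 1))
```

Because the averages are unweighted, a small site counts as much as a large one; a traffic-weighted average would be dominated by the two largest NSO sites.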
Table 2 shows the number of sessions by their duration. Some users access the website only briefly (0-10 seconds), indicating use without exploration. Others explore the website for a few minutes (11-180 seconds), indicating moderate exploration. And some access the website for extended periods of time (181 seconds or more), indicating in-depth use. However, some of these users may have opened the website and forgotten to close it, resulting in inflated duration times.
Table 2: Number of sessions by duration
|Website||0-10 seconds||(%)||11-180 seconds||(%)||181 seconds +||(%)|
|Country 1 – NSO Site||3,471,177||53.7||1,501,052||23.2||1,491,928||23.1|
|Country 1 – Data Portal||29,498||35.1||29,504||35.1||25,129||29.9|
|Country 2 – NSO Site||828,726||53.0||338,935||21.7||396,388||25.3|
|Country 3 – NSO Site||278,585||51.1||157,231||28.8||109,192||20.0|
|Country 3 – Data Portal||49,528||40.7||38,240||31.4||34,069||28.0|
|Country 4 – NSO Site||269,305||51.4||139,074||26.5||116,056||22.1|
|Country 5 – NSO Site||1,329,683||50.6||655,784||25.0||641,927||24.4|
|Country 6 – NSO Site||90,623||51.2||45,320||25.6||41,159||23.2|
|Country 6 – Data Portal||3,348||47.9||1,636||23.4||2,005||28.7|
|Country 7 – NSO Site||106,719||57.7||32,875||17.8||45,339||24.5|
|Average – NSO Sites||910,688||52.7||410,039||24.1||405,998||23.3|
|Average – Data Portals||27,458||41.2||23,127||30.0||20,401||28.8|
For the seven NSO sites, the majority of sessions lasted 10 seconds or less. On average, 52.7 percent of sessions lasted for 0-10 seconds; 24.1 percent of sessions lasted for 11-180 seconds; and 23.3 percent of sessions lasted for 181 seconds or more. For the three data portal sites, 41.2 percent of sessions lasted for 0-10 seconds; 30.0 percent of sessions lasted for 11-180 seconds; and 28.8 percent of sessions lasted for 181 seconds or more. This shows that on average, users explored for a longer time on the data portal sites than on the NSO sites, suggesting greater depth of usage. Figure 1 visualizes the distribution of sessions on each site.
Figure 1: Distribution of sessions by duration
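The three duration bands used in Table 2 can be reproduced from raw session lengths with a small helper. The sketch below assumes session durations are available in whole seconds; the sample values are invented for illustration.

```python
from collections import Counter

def duration_bucket(seconds):
    """Assign a session length in seconds to one of the Table 2 duration bands."""
    if seconds <= 10:
        return "0-10 seconds"
    if seconds <= 180:
        return "11-180 seconds"
    return "181 seconds +"

def bucket_shares(durations):
    """Percent of sessions falling in each duration band, rounded to one decimal."""
    if not durations:
        return {}
    counts = Counter(duration_bucket(d) for d in durations)
    total = len(durations)
    return {bucket: round(100.0 * n / total, 1) for bucket, n in counts.items()}

# Invented session durations in seconds (assumption, not study data).
sample_durations = [0, 3, 10, 11, 95, 180, 181, 600]
shares = bucket_shares(sample_durations)
print(shares)
```

Note that the band boundaries are inclusive at 10 and 180 seconds, so a session of exactly 180 seconds counts as moderate exploration.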
The number of pages viewed during a session provides a complementary measure of the depth of exploration. Sessions involving one page are those in which the user viewed a single page before exiting the site. Sessions involving two to four pages indicate moderate exploration of the site. Sessions that involved five or more pages reflect in-depth exploration of a site's content. As Table 3 shows, for the seven NSO sites, an average of 49.5 percent of sessions involved exploration of one page, while an average of 32.8 percent of sessions involved two to four pages, and an average of 17.7 percent of sessions involved five or more pages. For the three data portal sites, an average of 35.7 percent of sessions involved exploration of one page, while an average of 27.3 percent of sessions involved two to four pages, and an average of 37.1 percent of sessions involved five or more pages. Figure 2 shows the variation in the proportion of in-depth website usage across the various websites.
Table 3: Number of sessions by page depth
|Website||One or fewer||(%)||Two to four||(%)||Five or more||(%)|
|Country 1 – NSO Site||3,725,855||57.6||1,961,199||30.3||777,104||12.0|
|Country 1 – Data Portal||30,539||36.3||21,235||25.2||32,357||38.5|
|Country 2 – NSO Site||750,478||48.0||530,901||33.9||282,670||18.1|
|Country 3 – NSO Site||216,163||39.7||215,026||39.5||113,819||20.9|
|Country 3 – Data Portal||30,687||25.2||35,222||28.9||55,928||45.9|
|Country 4 – NSO Site||249,230||47.5||202,954||38.7||72,252||13.8|
|Country 5 – NSO Site||1,958,892||46.8||1,362,422||32.6||861,950||20.6|
|Country 6 – NSO Site||89,632||50.6||48,442||27.4||39,028||22.0|
|Country 6 – Data Portal||3,178||45.5||1,931||27.6||1,880||26.9|
|Country 7 – NSO Site||103,593||56.0||50,675||27.4||30,665||16.6|
|Average – NSO Sites||1,013,406||49.5||624,517||32.8||311,070||17.7|
|Average – Data Portals||21,468||35.7||19,463||27.3||30,055||37.1|
Figure 2: Distribution of sessions by page depth
Table 4 shows the average duration of sessions by page depth. Across NSO sites, sessions with one or fewer pages last for an average of 3 seconds. Sessions that involve an exploration of two to four pages last for an average of 4 minutes and 9 seconds. And sessions that involve five or more pages last for an average of 14 minutes and 17 seconds. Across data portal sites, sessions with one or fewer pages last for an average of 5 seconds. Sessions that involve an exploration of two to four pages last for an average of 2 minutes and 58 seconds. And sessions that involve five or more pages last for an average of 11 minutes and 57 seconds.
Table 4: Session duration by page depth
|Website||One or fewer||Two to four||Five or more|
|Country 1 – NSO Site||0:00:13||0:05:13||0:16:59|
|Country 1 – Data Portal||0:00:14||0:03:33||0:10:02|
|Country 2 – NSO Site||0:00:01||0:04:09||0:15:26|
|Country 3 – NSO Site||0:00:01||0:02:31||0:10:56|
|Country 3 – Data Portal||0:00:00||0:01:28||0:10:27|
|Country 4 – NSO Site||0:00:01||0:03:44||0:12:45|
|Country 5 – NSO Site||0:00:01||0:04:11||0:15:23|
|Country 6 – NSO Site||0:00:05||0:04:14||0:12:58|
|Country 6 – Data Portal||0:00:00||0:03:53||0:15:21|
|Country 7 – NSO Site||0:00:01||0:05:04||0:15:32|
|Average – NSO Sites||0:00:03||0:04:09||0:14:17|
|Average – Data Portals||0:00:05||0:02:58||0:11:57|
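The average rows in Table 4 can be checked by converting each H:MM:SS value to seconds, taking the unweighted mean, and converting back. The sketch below does this for two columns using values copied from Table 4; the helper names are illustrative.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_hms(total_seconds):
    """Convert seconds (rounded to the nearest second) back to 'H:MM:SS'."""
    total_seconds = round(total_seconds)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

def average_duration(values):
    """Unweighted mean of several 'H:MM:SS' durations."""
    return to_hms(sum(to_seconds(v) for v in values) / len(values))

# "Two to four" column for the seven NSO sites, copied from Table 4.
nso_two_to_four = ["0:05:13", "0:04:09", "0:02:31", "0:03:44",
                   "0:04:11", "0:04:14", "0:05:04"]
# "Five or more" column for the three data portals, copied from Table 4.
portal_five_or_more = ["0:10:02", "0:10:27", "0:15:21"]

print(average_duration(nso_two_to_four))
print(average_duration(portal_five_or_more))
```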
Variation over time: Most websites experience seasonal variation with lower traffic during holiday periods and higher traffic with new data releases and publications.
Understanding the patterns of variation that a website experiences will help provide context for web traffic data viewed in a specific month. Figure 3 shows that most websites experience a reduction in the number of sessions during the months of December and January. This may reflect a reduction in working hours during holidays or schools being out of session. Other drops may reflect issues such as website downtime.
Viewing how the number of sessions rises and falls across the months is also an effective way to gauge the success of dissemination efforts. After the release of a key statistical publication there should be an increase in the number of sessions, reflecting the activity of users seeking to access the new publication. Some of the months during which the overall web traffic is higher are the result of sharp spikes in user activity during a few days that may coincide with data releases or other events.
Figure 3: Number of sessions per month
Audience engagement channels: NSO websites were most often reached by organic search; data portals were accessed directly or by referral from another website.
Engagement channels record how users find their way to a given website: through a search engine, a bookmarked link, a link shared through social media, or other routes. Table 5 provides a breakdown of the channels through which sessions begin on each website. Figure 4 further illustrates the variation in engagement channels.
Across the seven NSO sites, an average of 66.4 percent of sessions began through organic searches. This indicates sessions initiated by users who found the NSO website using keywords through a search engine such as Google or Bing. There are many users who know the website name, but not the address and therefore access it through a search engine. For the three data portals, an average of 14.2 percent of sessions began through organic searches.
Many sessions also began through direct channels when a user typed the URL into the browser or clicked on a bookmarked link. For the seven NSO sites, an average of 26.1 percent of sessions began through direct channels, compared with 54.0 percent for the three data portals. Accessing a site through direct channels suggests more purposeful usage, because the user must already know how to access a given site, whereas accessing a site through organic search suggests users less familiar with how to access a site.
Sessions that began through referral involve users that clicked on a link from another website to reach the NSO website. This does not include links through major search engines, which are counted as organic searches. On average, data portal sites have a much higher percentage of referrals than NSO sites. For the seven NSO sites, an average of 6.4 percent of sessions began through referrals, while 30.7 percent of sessions on the three data portals began through referrals. This is most likely a result of data portal users being referred through the country’s NSO site.
Few sessions began through social, email, or other channels. Of those that did, most came from social media, such as Facebook or Twitter. The “other” channel refers to sources that Google Analytics is not able to classify. The fewest sessions began through email, which may be the result of limited email outreach from NSOs or because the Google Analytics tags needed to attribute visits to email were not included in the links sent by email.
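One common way to make email traffic identifiable is to append campaign parameters to the links placed in an email. The sketch below builds such a link using the standard `utm_source`, `utm_medium`, and `utm_campaign` query parameters; the URL, newsletter name, and campaign name are hypothetical, and this is an illustration rather than the configuration recommended in the report's annexes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_campaign_tags(url, source, medium, campaign):
    """Return the URL with campaign parameters appended to its query string."""
    parts = urlsplit(url)
    # Keep any query parameters the link already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))

# Hypothetical link placed in an email newsletter announcing a data release.
tagged = add_campaign_tags(
    "https://www.example.org/statistics/health/?lang=en",
    source="newsletter",
    medium="email",
    campaign="census_release",
)
print(tagged)
```

Links tagged with a medium of `email` can then be attributed to email outreach rather than being counted as direct visits.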
Table 5: Number of sessions by channel grouping (percent of sessions)
|Website||Direct||Email||Organic search||Other||Referral||Social|
|Country 1 – NSO Site||14.3||0.0||79.5||0.0||5.0||1.3|
|Country 1 – Data Portal||65.9||0.0||31.2||0.0||0.7||2.2|
|Country 2 – NSO Site||21.7||0.0||73.3||0.0||3.9||1.1|
|Country 3 – NSO Site||29.3||0.0||61.3||0.0||7.8||1.7|
|Country 3 – Data Portal||11.0||0.0||0.5||0.0||87.5||1.0|
|Country 4 – NSO Site||29.0||0.0||64.4||0.5||4.4||1.7|
|Country 5 – NSO Site||19.2||0.0||63.8||0.0||16.7||0.4|
|Country 6 – NSO Site||31.8||0.0||64.5||0.0||3.0||0.7|
|Country 6 – Data Portal||85.1||0.0||11.0||0.0||3.7||0.1|
|Country 7 – NSO Site||37.5||0.0||57.7||0.0||4.1||0.7|
|Average – NSO Sites||26.1||0.0||66.4||0.1||6.4||1.1|
|Average – Data Portals||54.0||0.0||14.2||0.0||30.7||1.1|
Figure 4: Percent of sessions by channel grouping
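To make the channel groupings concrete, the sketch below assigns sessions to channels from a source and medium pair and reports each channel's share. The rules are a simplified assumption for illustration and are not Google Analytics' exact definitions; the list of social sources is a small illustrative subset, and the sample sessions are invented.

```python
from collections import Counter

# Illustrative subset only; real tools maintain a much longer list.
SOCIAL_SOURCES = {"facebook.com", "twitter.com"}

def classify_channel(source, medium):
    """Simplified channel assignment based on a session's source and medium."""
    if source == "(direct)" and medium in ("(none)", "(not set)"):
        return "Direct"
    if medium == "organic":
        return "Organic Search"
    if medium == "email":
        return "Email"
    # Checked before the generic referral rule so that links from
    # social networks are not counted as ordinary referrals.
    if medium == "social" or source in SOCIAL_SOURCES:
        return "Social"
    if medium == "referral":
        return "Referral"
    return "(Other)"

def channel_shares(sessions):
    """Percent of sessions per channel, rounded to one decimal."""
    counts = Counter(classify_channel(src, med) for src, med in sessions)
    total = sum(counts.values())
    return {channel: round(100.0 * n / total, 1) for channel, n in counts.items()}

# Invented (source, medium) pairs for eight sessions.
sample_sessions = [
    ("(direct)", "(none)"),
    ("(direct)", "(none)"),
    ("google", "organic"),
    ("google", "organic"),
    ("bing", "organic"),
    ("nso.example.org", "referral"),
    ("facebook.com", "referral"),
    ("newsletter", "email"),
]
print(channel_shares(sample_sessions))
```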
Geographic origins: Across all sites, users most often come from within countries hosting the websites.
As Table 6 shows, for the seven NSO sites, the average percentage of users coming from within the country is 73.4, while for the data portals, the average is 65.4. The only instance in which less than half of web traffic came from domestic sources was the NSO site for Country 6. This high rate of international web traffic suggests that the country may have lower domestic capacity to access data, or that it receives support from an especially high number of international sources that need to access its data, or both. Figure 5 shows the variation in the percentage of domestic users across the NSO websites and data portals.
Table 6: Number of domestic users and sessions
|Website||Domestic Users||Proportion of all users(%)||Domestic Sessions||Proportion of all sessions(%)|
|Country 1 – NSO Site||3,489,081||83.0||5,477,639||84.7|
|Country 1 – Data Portal||49,328||83.8||70,734||84.1|
|Country 2 – NSO Site||773,515||83.9||1,377,374||88.1|
|Country 3 – NSO Site||146,039||62.5||431,475||79.2|
|Country 3 – Data Portal||34,569||62.3||81,532||66.9|
|Country 4 – NSO Site||265,977||82.4||434,597||82.9|
|Country 5 – NSO Site||1,772,764||87.5||3,799,943||90.8|
|Country 6 – NSO Site||50,443||49.2||104,252||58.9|
|Country 6 – Data Portal||1,988||50.0||3,636||52.0|
|Country 7 – NSO Site||60,211||65.1||134,979||73.0|
|Average – NSO Sites||936,861||73.4||1,680,037||79.7|
|Average – Data Portals||28,628||65.4||51,967||67.7|
Figure 5: Number of domestic users
Further analyzing the geographic origins of users may yield additional insights regarding user characteristics. It is even possible to view the data by city rather than by country. Unfortunately, providing a full list of the top international sources of web traffic to each site would compromise the anonymity of the countries participating in this study because most countries received a great deal of web traffic from neighboring countries, regardless of the income level of these countries.
The analysis of the geographic origins of users revealed the irregularity in the data stemming from the Reuters network. The raw data from Google Analytics for most websites showed the United Kingdom as providing the second highest number of users, with a bounce rate of more than 90 percent in most cases. Further analysis revealed that this was due to an unusual increase in activity coming from the Reuters network from July to November 2017, almost all of which immediately left. It is not clear what this spike in user activity from Reuters means. To provide data that show the behavior of actual users, the web traffic data from the Reuters network has been filtered out of the analysis.
Overview: Web traffic analysis provides evidence that websites and data portals are being used to access a wide variety of statistical data.
Are website visitors looking for data and do they find what they are looking for? Without knowing the content of every page, it is impossible to draw a definitive conclusion, but web traffic data provides useful insights.
To estimate the share of traffic going to data pages on NSO websites, the study examined the 30 pages receiving the most traffic, not including the home page. These top 30 pages accounted for 24 to 84 percent of all page views on each site, so even at the low end of that range they represent a significant portion of web traffic (see Table 7). Among the top 30 pages on each site, the share of page views on data-related pages ranged from 18 to 97 percent. NSO sites with a high percentage of data pages viewed had a more specialized focus on providing access to statistics, functioning much like a data portal. Other sites provided additional services that attracted a large share of total traffic, such as access to official records, information on employment opportunities, and forms for uploading data. (Although these are certainly data-related, in this analysis we only considered pages used to disseminate data.) These percentages, based on the 30 most heavily visited pages, are not necessarily representative of all traffic on the sites, but they confirm that, along with the less heavily used data portals, the NSO sites are important vehicles for data dissemination.
Table 7: Share of page views among top 30 pages
|Website||Total page views excluding home page||Page views of top 30 pages||Top 30 share of all page views (%)||Top 30 page views for data pages||Share of top 30 page traffic going to data (%)|
It was a challenge to identify the volume of web traffic directed solely to data-related pages because not all sites used a clear directory structure in their URLs. A clear directory structure organizes all data under /statistics/ or a similar keyword in the URL, then groups them by category and theme using descriptors that describe their contents, such as /statistics/education/ or /statistics/health/. On many of the sites, the page URLs are differentiated only by a serial number, necessitating a closer examination to determine their contents. Another challenge to the analysis of data use is that none of the websites participating in this study were set up to track data downloads. Moreover, some websites have been designed using iframes to provide access to data, which makes it impossible to track data downloads.
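To show why a clear directory structure matters for this kind of analysis, the sketch below classifies pages from their URL paths alone and computes the share of page views going to data pages. The paths, the page-view counts, and the assumption that all data live under /statistics/ are hypothetical; on sites whose URLs carry only serial numbers, this approach would not work and each page would have to be inspected by hand.

```python
from urllib.parse import urlparse

# Hypothetical page paths and page-view counts; none come from the study.
page_views = {
    "/statistics/education/enrolment": 400,
    "/statistics/health/": 250,
    "/careers/vacancies": 300,
    "/node/10432": 50,  # serial-number URL: contents cannot be inferred
}

DATA_PREFIX = "/statistics/"  # assumed top-level directory for data pages


def classify(url):
    """Return (is_data_page, theme) based only on the URL path."""
    path = urlparse(url).path
    if not path.startswith(DATA_PREFIX):
        return False, None
    parts = [p for p in path[len(DATA_PREFIX):].split("/") if p]
    return True, (parts[0] if parts else "general")


def data_share(views):
    """Percentage of page views that went to pages classified as data pages."""
    total = sum(views.values())
    data = sum(n for url, n in views.items() if classify(url)[0])
    return 100.0 * data / total if total else 0.0


print(classify("/statistics/education/enrolment"))
print(data_share(page_views))
```

The same grouping by the first path segment after the prefix gives a rough breakdown by theme, which is the kind of categorization that otherwise has to be done page by page.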
The contents of the top 30 pages were identified by examining each page and categorized to show trends in data access by page contents. Table 8 shows the results. Many pages provide access to publications such as statistical yearbooks that contain a variety of statistics. These were classified as “general statistics.” “Economic statistics” include national accounts, price indexes, employment statistics, and foreign trade data. “Social statistics” include demographic data, poverty statistics, and health and education data. The pages classified as “environment statistics” included agricultural statistics, geographic information, or maps. Pages in the top 30 that did not provide data were classified as “other.” With the exception of Country 2, pages with economic statistics were the most often visited among those pages that could be classified. No pages were found that included environmental indicators such as land use data, pollution indicators, or indicators of the built environment. This may reflect the interest, or lack of interest, of website visitors but may also reflect a lack of data available on the websites.
Table 8: Share of data pages by contents (%)
|Country 1||Country 2||Country 3||Country 4||Country 5||Country 6||Country 7||Average|
Besides the number of page views, other measures such as the exit rate, bounce rate, and time spent on page may be useful for gauging user interest in these pages. The interpretation of these measures and their usefulness in assessing how users respond to the site differ depending on the nature of the page. A low exit rate is good for pages that guide users to data located elsewhere on the site, because it means that users most likely continued on to those pages. A higher exit rate on a page that provides direct access to data may mean that users have found the data they need and left the site with their needs satisfied. A high bounce rate on a page providing data could indicate that users have that page bookmarked and regularly check it for updates. More time spent on a page could be positive if the page includes a great deal of information, but it could also indicate a page that users find difficult to navigate.
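As a minimal sketch of how these two rates are typically derived, the example below assumes the definitions commonly used in web analytics tools: exit rate as exits divided by page views of the page, and bounce rate as single-page sessions divided by sessions that began on the page. The class name and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    page_views: int  # all views of the page
    entrances: int   # sessions that started on the page
    bounces: int     # sessions that started on the page and viewed nothing else
    exits: int       # sessions that ended on the page

    def exit_rate(self):
        """Exits as a percentage of page views (assumed definition)."""
        return 100.0 * self.exits / self.page_views if self.page_views else 0.0

    def bounce_rate(self):
        """Bounces as a percentage of entrances (assumed definition)."""
        return 100.0 * self.bounces / self.entrances if self.entrances else 0.0


# Hypothetical counts for a navigation page and a page offering data directly.
landing_page = PageStats(page_views=2000, entrances=500, bounces=100, exits=200)
data_table = PageStats(page_views=1000, entrances=400, bounces=300, exits=700)

for name, page in (("landing", landing_page), ("data table", data_table)):
    print(name, page.exit_rate(), page.bounce_rate())
```

In this made-up case the navigation page shows low exit and bounce rates while the data page shows high ones, and both patterns could be consistent with users finding what they need.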
Another way to identify pages that are important to data users is to look at entry and exit data for the overall site. Data-related pages on which higher numbers of sessions begin suggest that users arrived at the site searching specifically for those data. Exit rates may indicate where users gave up on trying to find what they needed, but they also indicate where users were able to find what they were looking for, which helps NSOs identify types of data that are in the highest demand.
Data search activity: Site search analytics provides further evidence that users are most interested in economic data.
Although only a small segment of users accesses the search functionality of a website, analyzing web traffic related to search activity can provide valuable insight into the data-related needs of users. Keywords that users have typed in order to find data are a direct message from users asking for those data. They also suggest which types of data may be difficult to find, since that difficulty is what leads users to the website’s search functionality in the first place.
Only four of the ten websites included in this study had search data to analyze. Although the availability of website search activity data is limited, it is possible to carry out some initial analysis. The data in Table 9 show that few sessions involved the use of the site search functionality: on average, 1.9 percent of sessions on the sites below involved site search. Visitors viewed a search result page an average of 1.5 times after performing a search. Following a search, the average number of pages viewed was 4.6, ranging from a low of 2.2 to a high of 7.7. The highest number came from the one data portal for which search data were available. The average duration of a session following a search was 6 minutes and 6 seconds, which suggests that sessions that involve site search correspond to more in-depth exploration of the website. The exit rate of 20.9 percent may reflect users who, upon viewing the results, did not find what they were looking for and gave up searching.
Table 9: Website search activity
|Website||Sessions with search||Share of all page views (%)||Page views/ search||Average search depth||Time after search||Exit rate (%)|
|Country 1 – NSO Site||3,615||0.1||1.3||3.7||0:06:43||18.4|
|Country 3 – NSO Site||6,214||4.2||1.1||2.2||0:03:55||18.6|
|Country 3 – Data Portal||1,236||3.4||2.4||7.7||0:05:57||19.8|
|Country 5 – NSO Site||6,426||0.2||1.1||4.6||0:07:48||26.6|
|Average – All||4,373||1.9||1.5||4.6||0:06:06||20.9|
Note: For Country 1 and Country 5, search data are available October 1, 2017 to September 30, 2018, and for Country 3, data are available May 17, 2018 to September 30, 2018.
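As a rough check, several of the averages in the final row of Table 9 can be reproduced as simple unweighted means of the four site rows. The sketch below uses the published figures for sessions with search, average search depth, time after search, and exit rate; the helper function names are illustrative only.

```python
# Rows copied from Table 9: sessions with search, average search depth,
# time after search (h:mm:ss), exit rate (%).
TABLE_9 = {
    "Country 1 - NSO Site": (3615, 3.7, "0:06:43", 18.4),
    "Country 3 - NSO Site": (6214, 2.2, "0:03:55", 18.6),
    "Country 3 - Data Portal": (1236, 7.7, "0:05:57", 19.8),
    "Country 5 - NSO Site": (6426, 4.6, "0:07:48", 26.6),
}


def to_seconds(hms):
    """Convert an h:mm:ss string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(seconds):
    """Format a number of seconds as h:mm:ss, rounded to the nearest second."""
    seconds = round(seconds)
    return f"{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rows = list(TABLE_9.values())
avg_sessions = mean(r[0] for r in rows)
avg_depth = mean(r[1] for r in rows)
avg_time = to_hms(mean(to_seconds(r[2]) for r in rows))
avg_exit = mean(r[3] for r in rows)

print(round(avg_sessions), avg_time)
```

The means work out to about 4,373 sessions, a search depth of about 4.55 pages, 0:06:06 after a search, and an exit rate of about 20.85 percent, in line with the rounded figures in the table's final row.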
The data-related search terms from the four websites with search data echoed the findings from the analysis of the key data pages. Table 10 shows the top five data-related searches for each website. The exact terms have been adjusted to remove any words that might identify the website or country to preserve anonymity. On Country 1’s NSO site, “calendar,” which leads to release dates for data, was the most popular search term, accounting for 15.4 percent of all searches. For Country 3’s NSO site and data portal, the most popular search term was GDP, accounting for 1.9 percent and 0.9 percent of searches, respectively. For Country 5’s NSO site, the most popular search term was also GDP, which accounted for 4.1 percent of searches. Most data-related search terms refer to economic statistics such as GDP and foreign trade. There is additional interest in data release dates and in data on poverty and population.
Table 10: Data-related search terms
|Website||Search Terms||Number of Searches||Percent of Searches|
|Country 1 – NSO Site||Calendar||701||15.4|
|Population by Age||13||0.3|
|Country 3 – NSO Site||GDP||87||1.9|
|Country 3 – Data Portal||GDP||43||0.9|
|Country 5 – NSO Site||GDP||186||4.1|
Data Collection Challenges: This study identified common mistakes in the implementation of Google Analytics that, in some cases, made it difficult or impossible to draw definitive conclusions.
As noted earlier, NSO websites offer more diverse content than data portals. Because of that, some users of an NSO website may ignore statistical offerings and look for news announcements, job openings, or other services provided by the NSO online. Tracking patterns of activity on NSO websites that lead to data pages can be difficult because most sites provide multiple paths for accessing data, with some hosting data in different formats at separate locations.
The task is made more difficult when the website is not structured around easily identified landing pages or when the page URLs are made up of unrecognizable codes. A clear directory structure organizes all data under a recognizable keyword in the URL, such as /statistics/, then groups them by category and theme using descriptors that describe their contents, such as /statistics/education/ or /statistics/health/. On many of the sites, the page URLs are differentiated only by a serial number, necessitating a closer examination to determine their contents.
Most analytical features in Google Analytics are enabled automatically, but tracking of data downloads and of users’ search activity must be enabled separately. None of the countries in the study were able to implement download tracking. Moreover, some websites have been designed using iframes to provide access to data, which makes it impossible to track data downloads.
Some participating countries already had site search tracking enabled and could provide more than a year of data for analysis, while other countries were encouraged to enable this functionality. In most cases, enabling site search tracking simply requires clicking a button in the Google Analytics settings. However, in a few instances the search terms are not passed as default search parameters within the URL structure. In those cases, some adjustment to the Google Analytics settings is needed to track site searches, and custom coding may sometimes be required to enable this feature.
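The dependence on URL structure can be illustrated with a short sketch that pulls search terms out of the query string of a results-page URL. The domain, the URLs, and the parameter names (q, s, search) are assumptions chosen for illustration; each site's own search URLs would need to be checked to see which parameter, if any, carries the term.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse

# Assumed parameter names; a given site may use a different one or none at all.
QUERY_PARAMS = ("q", "s", "search")


def search_term(url):
    """Return the normalized search term found in the URL's query string, if any."""
    query = parse_qs(urlparse(url).query)
    for name in QUERY_PARAMS:
        values = query.get(name)
        if values:
            return values[0].strip().lower()
    return None


# Hypothetical search-result URLs.
urls = [
    "https://nso.example/search?q=GDP",
    "https://nso.example/search?q=gdp",
    "https://nso.example/?s=population+by+age",
    "https://nso.example/search/results/inflation",  # term in the path, not the query string
]

terms = Counter(t for t in map(search_term, urls) if t)
print(terms.most_common())
```

The last URL shows the problem case described above: the term sits in the path rather than in a query parameter, so a parameter-based setup captures nothing without further adjustment.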
If the first rule of communication is “know your audience,” then web analytics should form the foundation for NSOs’ digital dissemination strategies. The analytic tracks left by website users provide evidence of what users are looking for and what they found. To better anticipate what users need, NSOs should consider using feedback forms, user surveys, focus groups, and advisory councils complemented by web traffic data. Changes made in response to user demands should always be reflected in the data. An effective web analytics and optimization program creates a robust feedback loop that connects data providers, content creators, website managers, and IT staff back to their users. This study reaffirms our belief that NSOs should view web analytics as a vital resource available to them to help disseminate data effectively.
The study also confirmed that NSO websites and data portals themselves are important vehicles for data dissemination. In our analysis of the 30 pages receiving the most traffic on each site, not including the home page, data-related pages received 18 to 97 percent of the page views, while the top 30 pages together represented 24 to 84 percent of all page views on the sites. This indicates that NSO websites and data portals are disseminating data to users who seek them.
Volumes of traffic ranged from 3 million on the most heavily used NSO website to 3 thousand on the least heavily used microdata portal. The number of visitors to NSO websites tends to be proportionate to the population of the country, which is consistent with the observation that a large share of the traffic to the websites originates in the country (exact proportions are not given because they could breach the confidentiality agreement with participating countries). Across all sites, users accessing the websites most often come from within the country hosting the website. International visitors come predominantly from neighboring countries or from countries that have traditional ties of trade or aid flows with the host country. The only website and data portal for which domestic users made up less than two-thirds of website visitors were located in a small country receiving extensive donor support.
NSO sites often serve many functions in addition to providing data, which leads to users with a variety of needs from and expectations of the sites. Conversely, data portal sites serve a singular purpose and have a more limited type of user engaging with the sites. The study discovered that data portals had lower bounce rates than the NSO websites, suggesting more deliberate access of their content. On average, users explored data portal sites for longer and with greater page depth than NSO sites, which suggests either that users have greater interest in exploring their contents or that it may be more difficult for users to find the content they are looking for. The data comparing the number of sessions by page depth also confirm that, on average, users explored the data portal sites not only for a longer time but also in greater depth.
The variation in the number of sessions over time reveals traffic changes due to seasonal patterns, important content updates, and other external factors. The decrease during the months of December and January may reflect reduction in working hours or school attendance, as many individuals observe holidays at that time of year. Decreases may also reflect periods of time when the website was down. Increases may reflect releases of key statistical publications. Website managers should consider external factors such as these when assessing data from web analytics to gauge the impact of changes in their websites and their contents.
The study had difficulty determining the proportion of visitors who sought and found data on both the NSO and data portal sites. Most sites provide multiple paths to accessing data, with some hosting data in different formats at separate locations throughout the site. It was a challenge to separate the web traffic to these pages because most sites did not provide a clear directory structure in their URLs. To help find the data-related sections of NSO sites, the Open Data Inventory (ODIN) was used to identify specific pages containing key data-related content. These included landing pages for publications or sections dedicated to providing statistics. Although this may not include all possible data-related pages, analyzing pages included in the ODIN assessments shows web traffic for pages that provide access to highly sought data.
Economic indicators were the most in-demand content among website users. Pages containing employment statistics were among the most frequently accessed on the NSO sites. Foreign trade indicators, including export and import data, were also frequently accessed. Other frequently accessed economic indicators included GDP data and consumer price indices. Among social indicators, population data were the most frequently viewed, in some cases more frequently than economic data. Other social indicators concerned poverty, education, and health. The search terms from the websites with search tracking led to similar results. Among the most popular data-related search terms were “GDP,” “inflation,” “exports,” “population,” and “poverty.”
Improvements to the structure of websites and implementation of website analytics will increase the value of the data captured by web analytics. NSO managers and their website teams should use this information, along with other avenues for feedback, to respond to user demands for more and better data and to update and improve their websites and data portals accordingly.
 Counting on Statistics. https://aiddata.org/publications/counting-on-statistics
 Cape Town Global Action Plan for Sustainable Development Data. https://unstats.un.org/sdgs/hlg/cape-town-global-action-plan/
 Counting on Statistics. https://aiddata.org/publications/counting-on-statistics
 Greenwell, Geoffrey, et al. Making Data Portals Work for SDGs: A view on deployment, design, and technology. Discussion Paper No. 8. PARIS21. 2016. http://paris21.org/sites/default/files/Paper_on_Data_Portals%20wcover_WEB.pdf
 There is no practical difference between a website and a data portal, except that the latter may be expected to focus more specifically on providing access to data, while the general-purpose websites maintained by statistical offices often include other services and news reports.