by Reza Farivari
What makes an open data site special, desirable, and impressive? In my opinion, the first consideration is what the site contains. How relevant and useful are the data? Are they clearly identified and well documented? Are they up to date? Unless the data are meant to be static (for example, the findings of a research study) or archived, and labeled as such, most users want data to be very current. As my old boss used to say, “Data are like fish, the older they are, the less desirable.”
Once users have identified the data they want, their concern is how to retrieve them. Here is where I would like to focus. I have seen a number of sites where the content is very interesting and relevant but getting to the data is cumbersome. This was understandable in the days of mainframes, client-server environments, and CD-ROMs. Back then we could provide only a one-size-fits-all service, but websites today should be flexible enough to cater to different users’ needs and provide smart views of different types of data.
In my experience, there are no “typical” users. There are users who just want backdoor access to data and have no interest in the user interface (UI). For those users, we can have application programming interfaces (APIs) and pre-extracted bulk downloads. There are those who wish to peruse the data and analyze them in the environment of the site itself. For them we need an intuitive UI, good search and query features, and data visualization and extraction capabilities. With more people using mobile platforms (phones and tablets), we must cater to them as well. All these users may want access to the same datasets but reach them by different paths.
On some sites, not all data are available to all users. Users do not know this until they try to access the data, at which time they see the message “You do not have access to this dataset.” It would be best to inform users that special permission is needed earlier in the process.
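One way to surface this early is to carry an access flag in the catalog itself, so the listing can label restricted datasets before anyone clicks through. What follows is only a minimal sketch in Python: the `CatalogEntry` record, its `access` field, and the dataset titles are hypothetical and not drawn from any particular site.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Hypothetical catalog record; the field names are illustrative assumptions."""
    title: str
    access: str  # assumed values: "public" or "restricted"


def listing_label(entry: CatalogEntry) -> str:
    """Return the text shown in the catalog listing, flagging restrictions up front."""
    if entry.access == "restricted":
        return f"{entry.title} (special permission required)"
    return entry.title


# Made-up entries, used only to show the labels.
catalog = [
    CatalogEntry("Household survey", "public"),
    CatalogEntry("Firm-level microdata", "restricted"),
]
labels = [listing_label(entry) for entry in catalog]
for label in labels:
    print(label)
```

With a label like this in the listing, the user learns about the restriction while browsing rather than after attempting a download.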
On open data sites, we should anticipate that there will be a wide range of users, from naïve browsers to experienced researchers and professional data aggregators. There are a number of features that should be included on all open data sites. You may think of more, but for me, these dozen are the most relevant.
- The terms of use are easily accessible and clearly explained.
- Datasets are well documented with sources, footnotes, and metadata.
- Datasets have a creation date and update frequency, and users can be informed of updates.
- Users have a facility to give feedback on data quality and request new data.
- There is a contact person for each dataset as well as a support/help number or link.
- Methods for accessing data are consistent across databases (where data structure allows).
- APIs are available, documented, and provide access to both data and metadata.
- There is a query feature to peruse data before download or initial API access.
- Data can be extracted in various popular “data manipulable” formats, including XML.
- The site, data, and metadata can be accessed on mobile phones and tablets (iOS and Android).
- The site is visually pleasant, has search functions, and offers various ways to access its data catalog.
- The site allows for visualization of data in various ways (charts and maps), if appropriate.
And remember that there are many non-English speakers who need data.
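Several items in the list above (documentation, dates, a contact person, and multiple export formats) come down to keeping metadata next to the data and serializing both together. The sketch below illustrates that idea using only the Python standard library. The `DatasetInfo` fields, the function names, and the sample values are assumptions made for illustration, not a description of how any existing site works.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class DatasetInfo:
    """Hypothetical metadata record; the field names are illustrative."""
    title: str
    source: str
    created: str           # ISO date string, e.g. "2013-01-15"
    update_frequency: str  # e.g. "monthly"
    contact: str


def to_json(info, rows):
    """Bundle metadata and data rows into one JSON document."""
    return json.dumps({"metadata": asdict(info), "data": rows}, indent=2)


def to_csv(rows):
    """Write the data rows as CSV; assumes rows is a non-empty list of dicts."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(info, rows):
    """Write metadata and data rows as a single XML document."""
    root = ET.Element("dataset")
    metadata = ET.SubElement(root, "metadata")
    for key, value in asdict(info).items():
        ET.SubElement(metadata, key).text = str(value)
    data = ET.SubElement(root, "data")
    for row in rows:
        record = ET.SubElement(data, "record")
        for key, value in row.items():
            ET.SubElement(record, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample values, used only to exercise the functions.
info = DatasetInfo("Example series", "Example agency", "2013-01-15",
                   "monthly", "data-help@example.org")
rows = [{"year": 2011, "value": 10}, {"year": 2012, "value": 12}]
json_text = to_json(info, rows)
csv_text = to_csv(rows)
xml_text = to_xml(info, rows)
```

Because every format is produced from the same record, the creation date, update frequency, and contact travel with the data whichever path a user takes.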
Establishing an open data website in any form is a step in the right direction. However, simply posting data without considering users’ needs is not enough. Good website design makes a difference. Visitors to open data websites differ in their skill level, their knowledge of subject matter, and their needs. Including the features described above will increase user satisfaction and the success of the website.