I am in Rolle, Switzerland, on beautiful Lake Geneva, getting ready for a speech on Science 2.0 to the European Astronomical Society. As usual, travelling makes me read and think. In this case, I read a great paper on the reuse of scientific data: If We Share Data, Will Anyone Use Them?
One of the topics I am interested in is the reuse of open data. In the domain of open government, one of the key actions in the current EU eGov action plan concerns indicators for the reuse of public sector information (PSI). This is critical: after many years of fighting to have open government data, we now need to show that they are actually getting used and reused. Just as for online public services, there is a sense of disappointment with the low rate of open data reuse, typically measured by the number of dataset downloads or the number of users downloading datasets. Somehow there was the expectation that citizens would rush to play with government data once they became available.
In my opinion, this is a mistaken expectation. The vast majority of citizens are not interested in government data, and certainly not in directly manipulating them. What matters is not how many people download the data, but what the few people who care do with them. It does not matter if spending data are downloaded by few people: what matters is that among those few, someone is building great apps and services, used by millions, generating social and economic benefits.
Based on the literature on eGovernment, we somehow expect that UPTAKE indicators anticipate IMPACT indicators. If you have few users downloading, you expect the impact to be low, and vice versa. But the reality is that the success stories of open data happen when “data meet people”, when the right people come across the right data. When it comes to innovation, uptake is not a proxy for impact. What matters is not how many, but who. The number of downloads and the number of users should not be taken as headline indicators to measure the impact of open government.
The same is true in science. Publishing scientific data will not lead to thousands of scientists replicating the findings of other scientists. But we know from the Reinhart-Rogoff case that it takes only one student reusing the data to achieve a huge impact, in this case uncovering the mistaken evidence behind the most important economic decisions of our time.
An Open Strategy, in any domain, should aim not at generating massive participation, but at enabling and facilitating the work of those few who actually care about the data. That’s design for serendipity.
Findability of the data is key, and this is why metadata and standards are crucial to reaping the benefits of open data: they facilitate the serendipitous encounter of the right people with the right data.