Ed Fordham is the LibDem candidate for the new seat of Hampstead & Kilburn. He asked Stephen Taylor to cover a meeting in Westminster on publishing government data. Here, in the new spirit of transparency, is his report.
You asked me to attend on your behalf Philip Dunne’s parliamentary roundtable “Uncovering the truth” with the Audit Commission and EURIM, and to report on it. (EURIM aims to brief the next parliament’s MPs, such as you, on information topics.) The event was not held under the Chatham House Rule, so you’re reading about it here.
The meeting was not, as you first thought, about the notorious National Identity Scheme. That programme aims to make our lives transparent to an administration that increasingly shields itself from public scrutiny: detention without charge, ‘control orders’, secret inquests and all the other tools now waiting for a plausible tyrant to show up and grasp them.
This meeting was about a project that looks in the opposite direction. Its champion is Sir Tim Berners-Lee, inventor of the World Wide Web, who won the prime minister’s backing for putting official statistics on the Web in a usable format. We citizens, it was thought, would unlock their value in ways unimagined by officials. And so it quickly proved. To take one example, shortly after data.gov.uk went online, cyclists mashed data on bike accidents with OS map data to produce maps of black spots to avoid. More on this in an article in the current issue of Prospect: Mash the state.
You might have expected the Civil Service to block data.gov.uk. But Berners-Lee’s prestige and energy, and Brown’s support, carried the day.
The agenda for this meeting raised worries about the project. Are the data good enough? Will the public understand them? Will imperfect data further erode trust in official statistics? Will publication stifle innovation and risk-taking? We wondered if Whitehall was belatedly circling the waggons.
If data.gov.uk has enemies, you will be pleased that it enjoyed too much support from attendees for any opponent to break cover. As an MP you will hear arguments that this or that data should be excused from publication. The main arguments were rehearsed here. You will hear them again. They are laid out elegantly in The truth is out there, to be published by the Audit Commission next month.
- Insufficient quality: The data contain too many errors or are too out-of-date for public use. This becomes a self-fulfilling prophecy. There is no pressure to correct errors in unpublished data, so they remain too inaccurate to publish. The remedy is publication, warts and all. If better data would be valuable, people will press for them.
- Not fit for purpose: The data were collected for a particular purpose and will not support the analyses the public will want to make. Publish. The data can only get better.
- Requires interpretation: The data require skilled interpretation and attention to context to be useful; we should publish summaries, not the raw data. Publish. Interpretations will always be contested. Open that contest to public scrutiny.
- Problems with anonymisation: Anonymised data have had anything that might identify individuals removed. Until recently there was general agreement on how to anonymise data. Then researchers took fully anonymised NHS data and showed that by cross-referring them to other records they could identify the patients. This is a serious, devil-in-the-details problem without easy answers. We thought we knew how to anonymise data. Now we see anonymisation as more like encryption: breakable given sufficient data and computer time. Sooner or later someone will scandalously de-anonymise data posted on data.gov.uk. Then there will be calls to close it. This needs the kind of attention already given to fending off viruses, worms and other intrusions.
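For readers who want to see how cross-referring can undo anonymisation, here is a minimal sketch in Python. It is an illustration only, not the method the researchers used: the records, the field names (district, birth_year, sex) and the function reidentify are all invented for the example. The idea is simply that a few harmless-looking fields left in an anonymised record can match exactly one person in some other, named list.

```python
from collections import defaultdict

# Hypothetical "anonymised" records: names removed, but a few
# descriptive fields kept. All values are invented for illustration.
anonymised = [
    {"district": "NW6", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
    {"district": "NW6", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"district": "NW3", "birth_year": 1975, "sex": "M", "diagnosis": "migraine"},
]

# Hypothetical public list that carries names (also invented).
public_register = [
    {"name": "A. Example", "district": "NW6", "birth_year": 1961, "sex": "F"},
    {"name": "B. Sample", "district": "NW6", "birth_year": 1975, "sex": "M"},
    {"name": "C. Placeholder", "district": "NW3", "birth_year": 1975, "sex": "M"},
    {"name": "D. Specimen", "district": "NW3", "birth_year": 1975, "sex": "M"},
]

# Assumed set of fields shared by both sources.
QUASI_IDENTIFIERS = ("district", "birth_year", "sex")


def key(record):
    """Return the tuple of shared fields used to link the two sources."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymised_records, register):
    """Link anonymised records to names where the shared fields are unique."""
    names_by_key = defaultdict(list)
    for person in register:
        names_by_key[key(person)].append(person["name"])

    matches = {}
    for record in anonymised_records:
        candidates = names_by_key.get(key(record), [])
        # Only an unambiguous match counts as a re-identification.
        if len(candidates) == 1:
            matches[candidates[0]] = record["diagnosis"]
    return matches


reidentified = reidentify(anonymised, public_register)
for name, diagnosis in reidentified.items():
    print(name, "->", diagnosis)
```

In this toy case two of the three records are linked to a name. The third is not, because two people in the register share the same district, birth year and sex. Coarsening or dropping such fields makes unique matches rarer, but every additional outside dataset gives an attacker more fields to match on, which is why the comparison with encryption is apt.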
This was explicitly an insiders’ meeting of politicians, appointees to commissions and staffers from NGOs: the great and the good. Still, I was disappointed that among so much talk of what the public wants or can understand no one was gauche enough to say, “I am a citizen and this is what I want.” I listened in vain to the opening remarks for acknowledgement that parliaments were called only as kings reluctantly accepted restraint in return for revenue; that democracy began only when newspapers opened government action to public scrutiny. Every administration resists; access to data will always be contested.
Briefing papers for the meeting included a handsome discussion paper from the Audit Commission. Page 13 sports a revealing table of users and uses of public-sector data. It identifies as users: professional and frontline staff; service managers; corporate managers, directors and members; national government and regulators – and citizens. An impressive array of uses is tabulated for the users – except citizens. Apparently we use the data for “choices about services” and “democracy”. Where are the innovators who pounced on data.gov.uk and mapped bicycle accident black spots?
Perhaps the charitable view would be that the great and the good know they have no idea what we will do with this – and neither do we yet. That is, after all, the point.