November 11, 2013

Data-based journalism

Data journalism can range from ordinary journalism carried out with enough numeracy to use published data, to informative and insightful displays of information, to analysis, searching and linkage that wouldn’t be possible without computers.

The Herald and Keith Ng have an example of the last type: an analysis of New Zealand’s property records to find property owned by MPs but not declared in the Register of Pecuniary Interests. Property ownership records let you find out who owns a particular piece of land. They aren’t set up to let you go the other way and ask what property is owned by a particular individual, but computers can easily solve that sort of problem by brute force. There are other complications, since land might well be owned by a trust, not by the MP personally; these increase the effort, but don’t make it impossible. It doesn’t seem that the omissions in reporting violate the law (if the MPs were really trying to hide their holdings, they’d do a much better job), but the analysis does expose a loophole.
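To make the "brute force" idea concrete, here is a minimal sketch of turning a land-to-owner lookup into an owner-to-land one. It is not the Herald's or Keith Ng's actual method: the records, the trust-to-trustee table, and the function names (`normalise`, `build_owner_index`, `undeclared`) are all invented for illustration, and real name matching would be far messier than lowercasing and collapsing whitespace.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration: (title_id, registered_owner).
TITLES = [
    ("T1", "Jane Example"),
    ("T2", "Example Family Trust"),
    ("T3", "JANE  EXAMPLE"),   # same person, inconsistently entered
    ("T4", "Sam Other"),
]

# Hypothetical mapping from a trust to the people behind it.
TRUSTEES = {"Example Family Trust": ["Jane Example"]}


def normalise(name):
    """Crude name matching: lowercase and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def build_owner_index(titles, trustees):
    """Scan every title once and build a person -> set of title_ids index."""
    trustee_lookup = {
        normalise(trust): [normalise(p) for p in people]
        for trust, people in trustees.items()
    }
    index = defaultdict(set)
    for title_id, owner in titles:
        key = normalise(owner)
        index[key].add(title_id)
        # If the registered owner is a trust, also credit each trustee.
        for person in trustee_lookup.get(key, []):
            index[person].add(title_id)
    return index


def undeclared(index, person, declared):
    """Titles linked to a person that are missing from their declared list."""
    return sorted(index.get(normalise(person), set()) - set(declared))


index = build_owner_index(TITLES, TRUSTEES)
candidates = undeclared(index, "Jane Example", declared=["T1"])
print(candidates)
```

The output is only a list of candidates. Two different people can share a name, and a trustee is not necessarily a beneficial owner, so every match would still need checking by hand.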

This sort of search requires doing things to large, probably messy databases, but it also requires manual verification of all the findings, and the involvement of someone who can ask public figures for explanations — if the Herald asks questions and they don’t get answers, that’s news.

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.