October 10, 2012

Ignorance surveys

In the previous post I was sceptical about the importance of young Kiwis not being on first-name terms with their zucchini.  A post by Mark Liberman on Language Log suggests that ignorance surveys are even worse than I realized.

There’s the well-known problem of misreporting:

A new survey conducted by Chicago’s McCormick Tribune Freedom Museum, which has yet to open, finds that only 28 percent of Americans are able to name one of the constitutional freedoms, yet 52 percent are able to name at least two Simpsons family members.

when in fact the figure in that survey was 73%, not 28%.  What’s new is how bad the coding of responses can be, even in very respectable surveys:

The way it works is that the survey designers craft a question like this one (asked at a time when William Rehnquist was the Chief Justice of the United States):

“Now we have a set of questions concerning various public figures. We want to see how much information about them gets out to the public from television, newspapers and the like….
What about William Rehnquist – What job or political office does he NOW hold?”

Answers scored as incorrect included:

Supreme Court justice. The main one.
He’s the senior judge on the Supreme Court.
He is the Supreme Court justice in charge.
He’s the head of the Supreme Court.
He’s top man in the Supreme Court.

Mark Liberman concludes:

  • When you read or hear in the mass media that “Only X% of Americans know Y”, don’t believe it without checking the references — it’s probably false even as a report of the survey statistics.
  • When you read survey results claiming that “Only X% of Americans know Y”, don’t believe the claims unless the survey publishes (a) the exact questions asked; (b) the specific coding instructions used to score the answers; (c) a measure of inter-annotator agreement in blind tests; and (d) the raw response transcripts.

That might be going a bit far, but at least (a) and (b) are really important. If you call the elongated green vegetable a ‘courgette’, is that scored as right or wrong? What if you are from the US and call it ‘summer squash’, or from South Africa and (according to Wikipedia) call it ‘baby marrow’?
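To see how much the coding instructions can matter, here is a minimal Python sketch. Everything in it is made up for illustration: the responses are invented, the two accepted-answer sets (`STRICT` and `LENIENT`) are hypothetical coding rules, and Cohen's kappa is just one common choice of agreement measure, not necessarily the one Liberman had in mind. It scores the same responses under both rules and then treats the two rules as if they were two coders.

```python
from collections import Counter

# Hypothetical coding instructions: which answers count as "correct".
STRICT = {"zucchini"}
LENIENT = {"zucchini", "courgette", "summer squash", "baby marrow"}


def code_responses(responses, accepted):
    """Score each free-text response as 1 (correct) or 0 (incorrect)."""
    return [1 if r.strip().lower() in accepted else 0 for r in responses]


def cohens_kappa(a, b):
    """Cohen's kappa for two coders' labels on the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    # Proportion of items on which the two coders agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each coder's marginal label counts.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented responses, for illustration only.
responses = [
    "zucchini", "Courgette", "summer squash", "cucumber",
    "Zucchini", "baby marrow", "don't know", "zucchini",
]

strict = code_responses(responses, STRICT)
lenient = code_responses(responses, LENIENT)

print("share 'correct' under strict rule: ", sum(strict) / len(strict))
print("share 'correct' under lenient rule:", sum(lenient) / len(lenient))
print("kappa between the two rules:       ", cohens_kappa(strict, lenient))
```

With these invented responses, the strict rule scores 3 of 8 as correct and the lenient rule scores 6 of 8, so the headline "percent who know" doubles without anyone's knowledge changing.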


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.