March 5, 2012

Stats Crimes

Thank you for all the suggestions for Stats Crimes (feel free to continue to add your thoughts there too).

We thought you might be interested in a little insight into what the Department of Statistics at the University of Auckland came up with as their pet peeve Stats Crimes during their annual staff retreat recently.

Here are just a few of them; I’ll post more soon:

Random vs ‘random’

Talkback radio and others not understanding that their listeners and viewers (audience) are not a random population, however ‘random’ they may be.
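To see why a self-selected audience can mislead, here is a small simulation sketch. Everything in it is made up for illustration: the 30% true support rate, the assumption that supporters are five times as likely to phone in, and the function name `simulate_poll` are all hypothetical, not figures from any real poll.

```python
import random

def simulate_poll(population_size=100_000, support_rate=0.30,
                  call_prob_supporter=0.05, call_prob_other=0.01,
                  sample_size=1_000, seed=1):
    """Compare a random sample with a self-selected 'call-in' audience.

    All rates are invented for illustration only.
    """
    rng = random.Random(seed)

    # True/False for each person: do they support the proposal?
    population = [rng.random() < support_rate for _ in range(population_size)]
    true_share = sum(population) / population_size

    # A genuinely random sample: everyone has the same chance of selection.
    random_sample = rng.sample(population, sample_size)
    random_share = sum(random_sample) / sample_size

    # A call-in poll: supporters are (by assumption) keener to phone in.
    callers = [
        supports for supports in population
        if rng.random() < (call_prob_supporter if supports else call_prob_other)
    ]
    callin_share = sum(callers) / len(callers)

    return true_share, random_share, callin_share

true_share, random_share, callin_share = simulate_poll()
print(f"True support:      {true_share:.1%}")
print(f"Random sample:     {random_share:.1%}")
print(f"Call-in audience:  {callin_share:.1%}")
```

Under these assumed rates the random sample should land close to the true figure, while the call-in audience overstates support by a wide margin, no matter how many people phone in.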

Definitions and changing definitions

Not defining the “measure” they are talking about, e.g. poverty, quality of life, employment data. Not stating that “measures” have changed, i.e. in definition or classification. Often a change in definition or classification is part of the reason for an increase, e.g. autism.

Not really that amazing

Coincidences are more likely than you think.
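The classic illustration is the birthday problem. The sketch below assumes 365 equally likely birthdays, independent of each other, and ignores leap years; the function name `prob_shared_birthday` is just ours.

```python
def prob_shared_birthday(people, days=365):
    """Probability that at least two of `people` share a birthday.

    Assumes birthdays are independent and uniform over `days` days.
    """
    prob_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        prob_all_distinct *= (days - i) / days
    return 1.0 - prob_all_distinct

for group_size in (10, 23, 50):
    print(group_size, round(prob_shared_birthday(group_size), 3))
```

With only 23 people the chance of a shared birthday is already just over one half, which is far more than most people guess.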

Association vs causation

Causal headlines from observational studies. “People who are poor watch too much TV. Get rid of TV no more poor people!” (Made this up.)

Bad graphs

Pie charts should be outlawed, as comparisons of size (areas) are impossible; bar charts should be used instead. 3D graphs should also be outlawed. Chart junk. Uninformative graphics.

Comments

  • Thomas Lumley

    Using totals where averages would be more appropriate. Two recent examples are on animal experimentation (Think of a number, then double it), and on school fees (Think of a number, then multiply by four).

    12 years ago

  • Ben Brooks

    Using mean when median would be much more meaningful.

    Basing a story on an extrapolated trend when there is good reason to think the trend won’t continue (e.g. if the car continues to accelerate at the current rate it will break the speed of light within five years).

    12 years ago

  •

    Here are a couple of mine. Possibly slightly heretical.

    “Talkback radio and others not understanding that their listeners and viewers (audience) are not a random population, however ‘random’ they may be.”

    I would use the word representative. I think the word random is vague and potentially confusing, and should be used much less than it is.

    Another very common stats crime is interpreting p-values as if they were posterior probabilities, even among researchers who should know better. It’s quite easy to think of cases where the p-value is less than 0.05 but the posterior probability of the null hypothesis is *increased* by the data.

    12 years ago
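One way to illustrate the p-value point in the last comment is the setup usually called Lindley's paradox. This is a sketch under assumptions of our own choosing, not the commenter's example: a normal mean with known standard deviation `sigma`, a point null hypothesis (mean exactly zero) given prior probability 0.5, a normal prior with standard deviation `tau` on the mean under the alternative, and a large sample size `n`. The function names are hypothetical.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

def posterior_null_probability(z, n, sigma=1.0, tau=1.0, prior_null=0.5):
    """Posterior probability of H0: mean = 0, given the z statistic.

    Assumed model: sample mean ~ Normal(mean, sigma^2 / n);
    under H1 the mean has a Normal(0, tau^2) prior.
    """
    s2 = sigma ** 2 / n                      # variance of the sample mean
    shrink = tau ** 2 / (s2 + tau ** 2)
    # Bayes factor in favour of H0: ratio of the two marginal densities.
    bf01 = math.sqrt(1 + tau ** 2 / s2) * math.exp(-0.5 * z ** 2 * shrink)
    prior_odds = prior_null / (1 - prior_null)
    posterior_odds = prior_odds * bf01
    return posterior_odds / (1 + posterior_odds)

z, n = 2.0, 10_000
print(f"p-value:            {two_sided_p_value(z):.4f}")
print(f"prior P(H0):        0.5")
print(f"posterior P(H0):    {posterior_null_probability(z, n):.3f}")
```

With these assumed settings the p-value is below 0.05, yet the posterior probability of the null ends up well above its prior value of 0.5. The result depends on the prior chosen for the alternative, which is part of why a p-value cannot simply be read as a posterior probability.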