July 4, 2012

Lazy scientific fraud

If you toss a coin 20 times, you will get 10 heads on average.  But if someone claims to have done this experiment 190 times and got exactly 10 heads out of 20 every single time, they are either lying or a professional magician.
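To put a rough number on that, here is a minimal sketch assuming a fair coin and independent tosses. The function name `prob_exact_heads` is just an illustrative choice.

```python
from math import comb

def prob_exact_heads(n_tosses: int, n_heads: int) -> float:
    """Binomial probability of exactly n_heads heads in n_tosses fair tosses."""
    return comb(n_tosses, n_heads) / 2 ** n_tosses

# Chance of exactly 10 heads in one set of 20 tosses (a bit under 1 in 5).
p_once = prob_exact_heads(20, 10)

# Chance of that happening in all 190 independent sets of 20 tosses.
p_all = p_once ** 190

print(f"one experiment: {p_once:.4f}")
print(f"all 190 experiments: {p_all:.3g}")
```

So even the single most likely outcome turns up less than one time in five, and seeing it 190 times in a row is not something chance will deliver.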

An anaesthesiology researcher, Yoshitaka Fujii, has the new record for number of papers retracted in scientific journals: 172 and counting. The fakery was uncovered by an analysis of the results of all his published randomized trials, showing that they had an unbelievably good agreement between the treatment and control groups, far better than was consistent with random chance.  For example, here’s the graph of differences in average age between treatment and control groups for Fujii’s trials (on the left) and other people’s trials (on the right), with the red curve indicating the minimum possible variation due only to chance.
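As an illustration of how much variation chance alone should produce, here is a small simulation sketch. It is not the method used in the published analysis, and the numbers in it (30 patients per group, mean age 50, standard deviation 10, normally distributed ages) are assumptions made up for the example.

```python
import random
import statistics

def simulate_age_differences(n_trials=200, n_per_arm=30,
                             mean_age=50.0, sd_age=10.0, seed=1):
    """Standardised treatment-minus-control differences in mean age,
    for honestly randomised trials with hypothetical sizes and ages."""
    rng = random.Random(seed)
    # Standard error of a difference between two independent group means.
    se = sd_age * (2 / n_per_arm) ** 0.5
    z_scores = []
    for _ in range(n_trials):
        treatment = [rng.gauss(mean_age, sd_age) for _ in range(n_per_arm)]
        control = [rng.gauss(mean_age, sd_age) for _ in range(n_per_arm)]
        diff = statistics.fmean(treatment) - statistics.fmean(control)
        z_scores.append(diff / se)
    return z_scores

z_scores = simulate_age_differences()
spread = statistics.stdev(z_scores)
print(f"standard deviation of standardised differences: {spread:.2f}")
```

With genuine randomisation the standardised differences should have a standard deviation close to 1. A set of trials whose differences are squeezed much more tightly around zero is too well balanced to be the result of chance.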

The problem was pointed out more than ten years ago, in a letter to one of the journals involved, entitled “Reported data on granisetron and postoperative nausea and vomiting by Fujii et al. are incredibly nice!”  Nothing happened.  Perhaps a follow-up letter should have been titled “When we say ‘incredibly nice’, we mean ‘made-up’, and you need to look into it”.

Last year, Toho University, Fujii’s employer, did an investigation that found eight of the trials had not been approved by an ethics committee (because they hadn’t, you know, actually happened). They didn’t comment on whether the results were reliable.

Finally, the journals got together and gave the universities a deadline to come up with evidence that the trials existed, were approved by an ethics committee, and were reported correctly.  Any papers without this evidence would be retracted.

Statistical analysis to reveal fraud is actually fairly uncommon.  It requires lots of data, and lazy or incompetent fraud: if Prof Fujii had made up individual patient data using random number generators and then analysed it, there would have been no evidence of fraud in the results.   It’s more common  to see misconduct revealed by re-use or photoshopping of images, by failure to get ethics committee approvals, or by whistleblowers.  In some cases where the results are potentially very important, the fraud gets revealed by attempts to replicate the work.


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • It’s that same effect, I think, the variability in results, that gets me urging people not to take the results of the latest results from a single study that the Herald enjoys reporting every so often as gospel.

    I apologise for the awfulness of that sentence.

    12 years ago

  • Robyn Gandell

    See “Freakonomics”, or is it “Superfreakonomics”, for a series of student multichoice test answers, some of which were made up by teachers. Can you pick the made-up answers? Knowing about random chance and probability, it’s not hard.

    12 years ago

  • Thomas Lumley

    Deborah Nolan, at UC Berkeley, has a class exercise where students either flip a coin 100 times or just make up a 100-letter ‘random’ sequence of H and T. She then glances at each one and says whether it was really random or made up, with almost perfect accuracy. People making up random sequences don’t include long enough runs of the same answer.

    12 years ago

  • Clive Robinson

    @ Thomas Lumley,

    “People making up random sequences don’t include long enough runs of the same answer.”

    One of the (many) problems with “True Random Number Generators” is that the runs can be too long for certain uses (cryptography and certain types of simulation being just two).

    For instance in theory a high output TRNG could produce a run of 20 to 50 bits of the same value in a reasonably short time frame (or other regular pattern etc). But you might not be happy to use the sequence, or in other circumstances they might be a predictor that your generator is under undue external influence or in the process of failing. Sadly mitigating such issues in real time is not an easy task.

    Sadly, though, many suppliers of commercial TRNGs, especially embedded ones, appear to use this as an excuse not to allow you to “view the raw data”.

    Intel are a bit notorious in this respect and almost always employ the “magic pixie dust” of a suitable one way or hash function…

    As a piece of advice to casual readers “when talking random” it’s probably best to be sure the other person knows “which random” you mean and why… Otherwise you might be “living in a state of sin” and find “One man’s meat, is another man’s poison” with appropriate dire results :-(

    12 years ago
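The point made in the comments about runs in coin-flip sequences can be checked with a short simulation. This is a minimal sketch assuming fair, independent flips. The function names and the choice of 2000 simulated sequences are illustrative, and it is not a reconstruction of the class exercise itself.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical consecutive items in seq."""
    best = current = 0
    previous = None
    for item in seq:
        current = current + 1 if item == previous else 1
        best = max(best, current)
        previous = item
    return best

def simulate_longest_runs(n_sequences=2000, n_flips=100, seed=1):
    """Longest run (of H or of T) in each of many simulated 100-flip sequences."""
    rng = random.Random(seed)
    return [longest_run(rng.choice("HT") for _ in range(n_flips))
            for _ in range(n_sequences)]

runs = simulate_longest_runs()
share_5_plus = sum(r >= 5 for r in runs) / len(runs)
print(f"share of sequences with a run of 5 or more: {share_5_plus:.2f}")
```

In the large majority of genuinely random 100-flip sequences, the longest run is at least five of the same face. A hand-written sequence with nothing longer than three or four in a row is therefore a reasonable candidate for "made up".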