February 12, 2015


  • Ways of visualising uncertainty in statistics, from Visualising Data
  • Football competes with internet porn for audience: analysis from Pornhub
    The zero line is ‘average day and time’: a better comparison would have been a typical winter Sunday.
  • The New Yorker, on the problems with so-called precision medicine: The pace of genetics research, the variability of test methods and results, and the aura of infallibility with which the tests are marketed, she told me, make this advance a more complicated one than the EKG.  But, as the demand for DNA testing increases, she says, “it will probably be a bit worse before it gets better.”
  • A panel of the Institute of Medicine has come out with a definition, diagnostic criteria, and a new name for ‘chronic fatigue syndrome’.  The question wasn’t whether people were sick — that’s pretty obvious. The question was which set of people have the same thing wrong with them, and how to tell.  It’s a statistical issue because a definition leads to counting people who satisfy it.
  • It sees you when you’re sleeping; it knows when you’re awake: smart power meters on the front page of the Dominion Post.  (It also sends you lots of email whenever you alter your habits, eg, by travelling).
  • “There’s no plague on the New York subway. No platypuses either”.  Ed Yong on false positives in DNA testing. His team swabbed tomato plants in a field in Virginia, analysed the DNA in those samples, and found matches to the duck-billed platypus—an Australian animal, not known to live in Virginia. They then analysed over 19,000 publicly available microbiome samples from around the world; around a third threw up matches for platypus DNA. Either the platypus secretly rules the world or, more likely, this was a hilarious case of false positives gone mad.
  • NHS Choices makes StatsChat look tactful and friendly: they are going after the newspapers on Twitter
  • “But these headlines are without serious foundation, and through no fault of the journalists.”  David Spiegelhalter on UK coverage of a study of health associations with low-level alcohol consumption.
  • How to release data in a spreadsheet: clean-sheet.org.  Send this to everyone you know who releases data, or just put it on your blog in a passive-aggressive way. The key point is that data release is different from data presentation.

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.