Posts filed under Experiments (24)

August 2, 2014

When in doubt, randomise

The Cochrane Collaboration, the massive global conspiracy to summarise and make available the results of clinical trials, has developed ‘Plain Language Summaries’ to make the results easier to understand (they hope).

There’s nothing terribly notable about a plain-language initiative; they happen all the time. What is unusual is that the Cochrane Collaboration tested the plain-language summaries in a randomised comparison against the old format. The abstract of their research paper (not, alas, itself a plain-language summary) says:

With the new PLS, more participants understood the benefits and harms and quality of evidence (53% vs. 18%, P < 0.001); more answered each of the five questions correctly (P ≤ 0.001 for four questions); and they answered more questions correctly, median 3 (interquartile range [IQR]: 1–4) vs. 1 (IQR: 0–1), P < 0.001. Better understanding was independent of education level. More participants found information in the new PLS reliable, easy to find, easy to understand, and presented in a way that helped make decisions. Overall, participants preferred the new PLS.

That is, it worked. More importantly, they know it worked.

July 24, 2014

Weak evidence but a good story

An example from Stuff, this time

Sah and her colleagues found that this internal clock also affects our ability to behave ethically at different times of day. To make a long research paper short, when we’re tired we tend to fudge things and cut corners.

Sah measured this by finding out the chronotypes of 140 people via a standard self-assessment questionnaire, and then asking them to complete a task in which they rolled dice to win raffle tickets – higher rolls, more tickets.

Participants were randomly assigned to either early morning or late evening sessions. Crucially, the participants self-reported their dice rolls.

You’d expect the dice rolls to average out to around 3.5. So the extent to which a group’s average exceeds this number is a measure of their collective result-fudging.

“Morning people tended to report higher die-roll numbers in the evening than the morning, but evening people tended to report higher numbers in the morning than the evening,” Sah and her co-authors wrote.

The research paper is here.  The Washington Post, where the story was taken from, has a graph of the results, and they match the story. Note that this is one of the very few cases where starting a bar chart at zero is a bad idea. It’s hard to roll zero on a standard die.
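The 3.5 baseline in the excerpt is just the expected value of a fair die, which a quick simulation confirms:

```python
import random

# A fair six-sided die averages (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5,
# so a group whose self-reported rolls average well above 3.5 is
# collectively over-reporting.
random.seed(1)
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(round(sum(rolls) / len(rolls), 2))  # close to 3.5
```

That's why the interesting quantity in the graphs is the gap above 3.5, not the raw average.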



The research paper also has a graph of the results, which makes the effect look bigger, but that is defensible in this case, as 3.5 really is “zero” for the purposes of the effect they are studying.



Unfortunately, neither graph has any indication of uncertainty. The evidence of an effect is not negligible, but it is fairly weak (a p-value of 0.04 from 142 people). It’s easy to imagine someone doing an experiment like this and not publishing it if they didn’t see the effect they expected, and it’s pretty certain that you wouldn’t be reading about the results if they hadn’t, so it makes sense to be a bit skeptical.
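One way to see why scepticism is warranted: in a toy simulation (hypothetical numbers, not the paper’s data) where nobody fudges at all, the runs that happen to cross p < 0.05, and so get written up, still show effects that look respectably large.

```python
import random
import statistics

# Toy simulation (hypothetical numbers, not the paper's data): two groups
# of honest die-rollers, so the true difference is zero. We "publish" only
# runs where a crude z-test crosses |z| > 1.96 (p < 0.05, two-sided).
random.seed(2)

def experiment(n=71):
    a = [random.randint(1, 6) for _ in range(n)]
    b = [random.randint(1, 6) for _ in range(n)]
    diff = statistics.mean(a) - statistics.mean(b)
    se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
    return diff, abs(diff) / se

results = [experiment() for _ in range(2000)]
published = [abs(d) for d, z in results if z > 1.96]

print(len(published) / len(results))  # about 0.05 of null runs "get published"
print(statistics.mean(published))     # yet their effects look sizeable
```

The published runs are exactly the lucky ones, so their average effect is inflated even though the true effect is zero.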

The story goes on to say

These findings have pretty big implications for the workplace. For one, they suggest that the one-size-fits-all 9-to-5 schedule is practically an invitation to ethical lapses.

Even assuming that the effect is real and that lying about a die roll in a psychological experiment translates into unethical behaviour in real life, the findings don’t say much about the ‘9-to-5’ schedule. For a start, none of the testing was conducted between 9am and 5pm.


April 25, 2014

Sham vs controlled studies: Thomas Lumley’s latest Listener column

How can a sham medical procedure provide huge benefits? And why do we still do them in a world of randomised, blinded trials? Thomas Lumley explores the issue in his latest New Zealand Listener column. Click here.

April 4, 2014

Thomas Lumley’s latest Listener column

…”One of the problems in developing drugs is detecting serious side effects. People who need medication tend to be unwell, so it’s hard to find a reliable comparison. That’s why the roughly threefold increase in heart-attack risk among Vioxx users took so long to be detected …”

Read his column, Faulty Powers, here.

December 23, 2013

Meet Callum Gray, Statistics Summer Scholar 2013-2014

Every year, the Department of Statistics at the University of Auckland offers summer scholarships to a number of students so they can work with our staff on real-world projects. We’ll be profiling the 2013-2014 summer scholars on Stats Chat. Callum is working with Dr Ian Tuck on a project titled Probability of encountering a bus.

Callum (right) explains:

“If you encounter a bus on a journey, you are likely to be exposed to higher levels of pollution. I am trying to find the probability of encountering a bus and how many you will encounter when you travel from place A to place B, taking into account variables such as the time of day and mode of transport.


“This research is useful because it will give us more of an understanding about the impact that buses have on our daily exposure to pollution. We can use this information to plan journeys and learn more about an issue that is becoming more and more apparent.

“I was born in Auckland and have lived here my whole life. I just finished my third year of a Bachelor of Commerce/Bachelor of Science conjoint majoring in Accounting, Finance, and Statistics, which I will finish at the end of 2014.

“Statistics appeals to me because it is used every day in conjunction with many other areas. It is very useful to know in a lot of workplaces, and it is interesting because it has a lot of real-life applications.

“I am going to Napier for Christmas and Rhythm and Vines for New Year. In the rest of my spare time, I will be playing cricket and golf, as well as hanging out with friends.”



October 8, 2013

100% protection?

The Herald tells us

Sunscreen provides 100 per cent protection against all three types of skin cancer and also safeguards a so-called superhero gene, a new study has found.

That sounds dramatic, and you might wonder how this 100% protection was demonstrated.

The study involved conducting a series of skin biopsies on 57 people before and after UV exposure, with and without sunscreen.

There isn’t any link to the research or even the name of the journal, but the PubMed research database suggests that this might be it, which is confirmed by the QUT press release. The researcher name matches, and so does the number of skin biopsies. They measured various types of cellular change in bits of skin exposed to simulated solar UV light, at twice the dose needed to turn the skin red, and found that sunscreen reduced the changes to less than the margin of error. This looks like good-quality research, and it indicates that sunscreen definitely will give some protection from melanoma, but 100% must be going too far given the small sample and moderate UV dose.

I was also a bit surprised by the “so-called superhero gene”, since I’d never seen p53 described that way before. It’s not just me: Google hasn’t seen that nickname either, except on copies of this story.

August 18, 2013

Correlation, genetics, and causation

There’s an interesting piece on cannabis risks at Project Syndicate. One of the things it looks at is the correlation between frequent cannabis use and psychosis. Many people are, quite rightly, unimpressed with this sort of correlation, since it isn’t hard to come up with explanations for psychosis causing cannabis use or for other factors causing both.

However, there is also some genetic data.  The added risk of psychosis seems to be confined to people with two copies of a particular genetic variant in a gene called AKT1. This is harder to explain as confounding (assuming the genetics has been done right), and is one of the things genetics is useful for. This isn’t just a one-off finding; it was found in one study and replicated in another.

On the other hand, the gene AKT1 doesn’t seem to be very active in brain cells, making it more likely that the finding is just a coincidence.  This is one of the things bioinformatics is good for.

In times like these it’s good to remember Ben Goldacre’s slogan “I think you’ll find it’s a bit more complicated than that.”

June 4, 2013

Survey respondents are lying, not ignorant

At least, that’s the conclusion of a new paper from the National Bureau of Economic Research.

It’s a common observation that some survey responses, if taken seriously, imply many partisans are dumber than a sack of hammers.  My favorite example is the 32% of respondents who said the Gulf of Mexico oil well explosion made them more likely to support off-shore oil drilling.

As Dylan Matthews writes in the Washington Post, though, the research suggests people do know better. Ordinarily they give the approved politically correct answer for their party:

In the control group, the authors find what Bartels, Nyhan and Reifler found: There are big partisan gaps in the accuracy of responses. …. For example, Republicans were likelier than Democrats to correctly state that U.S. casualties in Iraq fell from 2007 to 2008, and Democrats were likelier than Republicans to correctly state that unemployment and inflation rose under Bush’s presidency.

But in an experimental group where correct answers increased your chance of winning a prize, the accuracy improved markedly:

Take unemployment: Without any money involved, Democrats’ estimates of the change in unemployment under Bush were about 0.9 points higher than Republicans’ estimates. But when correct answers were rewarded, that gap shrank to 0.4 points. When correct answers and “don’t knows” were rewarded, it shrank to 0.2 points.

This is probably good news for journalism and for democracy.  It’s not such good news for statisticians.

April 1, 2013


Despite the date, this is not in any way an April Fools post

  • “Data is not killing creativity, it’s just changing how we tell stories”, from Techcrunch
  • Turning free-form text into journalism: Jacob Harris writes about an investigation into food recalls (nested HTML tables are not an open data format either)
  • Green labels look healthier than red labels, from the Washington Post. When I see this sort of research I imagine the marketing experts thinking “how cute, they figured that one out after only four years”
  • Frances Woolley debunks the recent stories about how Facebook likes reveal your sexual orientation (with comments from me).  It’s amazing how little you get from the quoted 88% accuracy, even if you pretend the input data are meaningful.  There are some measures of accuracy that you shouldn’t be allowed to use in press releases.
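To see how little a headline accuracy number can mean, here’s a toy calculation (the 5% base rate is hypothetical, not from the Facebook study): a classifier that never predicts the rare class at all can beat 88% accuracy while identifying nobody.

```python
# Toy calculation (hypothetical 5% base rate, not from the Facebook study):
# when one class is rare, "predict the majority class for everyone" gets
# high accuracy while detecting no one at all.
n = 1000
positives = 50                  # 5% of people in the rare class

true_negatives = n - positives  # the lazy classifier is right on these
accuracy = true_negatives / n
sensitivity = 0 / positives     # it never flags a single positive

print(accuracy)     # 0.95 -- comfortably above a quoted 88%
print(sensitivity)  # 0.0
```

This is why accuracy on an imbalanced classification task tells you almost nothing without the base rate and the sensitivity.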
March 22, 2013


  • A post at Scientific American about covering clinical trials, for journalists and readers.  It’s a summary from the Association of Health Care Journalists annual conference. Starts out “My message: Ask the hard questions.”
  • Asking the hard questions is also useful in covering surveys. Stuff reports “Kiwi leaders amongst the world’s riskiest”:

    New Zealand leaders are among the most likely in the world to ignore data and fail to seek a range of opinions when making decisions

    with no provenance except that this was based on a 600,000-person survey of managers and professionals by SHL. Before trying to track down any more detail, just think: how could this have worked? How would you get reliable information to support those conclusions from each of 600,000 people?

  • You may have heard about the famous Hawthorne experiment, where raising light levels in a factory improved output, as did lowering them, as did anything else experimental. The original data have been found and this turns out not to be the case.