Posts filed under Significance (14)

April 4, 2014

Thomas Lumley’s latest Listener column

…”One of the problems in developing drugs is detecting serious side effects. People who need medication tend to be unwell, so it’s hard to find a reliable comparison. That’s why the roughly threefold increase in heart-attack risk among Vioxx users took so long to be detected …”

Read his column, Faulty Powers, here.

November 27, 2013

Interpretive tips for understanding science

From David Spiegelhalter, William Sutherland, and Mark Burgman, twenty (mostly statistical) tips for interpreting scientific findings

To this end, we suggest 20 concepts that should be part of the education of civil servants, politicians, policy advisers and journalists — and anyone else who may have to interact with science or scientists. Politicians with a healthy scepticism of scientific advocates might simply prefer to arm themselves with this critical set of knowledge.

A few of the tips, without their detailed explication:

  • Differences and chance cause variation
  • No measurement is exact
  • Bigger is usually better for sample size
  • Controls are important
  • Beware the base-rate fallacy
  • Feelings influence risk perception

November 19, 2013

Tune in, turn on, drop out?

From online site Games and Learning

A massive study of some 11,000 youngsters in Britain has found that playing video games, even as early as five years old, does not lead to later behavior problems.

This is real research, looking at changes over time in a large number of children and it does find that the associations between ‘screen time’ and later behaviour problems are weak. On the other hand, the research paper concludes

 Watching TV for 3 h or more at 5 years predicted a 0.13 point increase (95% CI 0.03 to 0.24) in conduct problems by 7 years, compared with watching for under an hour, but playing electronic games was not associated with conduct problems.

When you see “was not associated”, you need to look carefully: are they claiming evidence of absence, or just weakness of evidence? Here are the estimates in graphical form, comparing changes in a 10-point questionnaire about conduct.



The data largely rule out average differences as big as half a point, so this study does provide evidence there isn’t a big impact (in the UK). However, it’s pretty clear from the graph that the data don’t provide any real support for a difference between TV and videogames.  The estimates for TV are more precise, and for that reason the TV estimate is ‘statistically significant’ and the videogames one isn’t, but that’s not evidence of difference.

It’s also interesting that there’s mild support in the data for ‘None’ being worse than a small amount. Here the precision is higher for the videogame estimate, because there are very few children who watch no TV (<2%).
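
To make the "significant vs not significant" point concrete, here is a rough sketch of how you would compare two estimates directly. The TV figure is the one quoted above; the videogame figure is HYPOTHETICAL (the paper's numbers aren't reproduced in this post), and the calculation treats the two estimates as independent, which is a simplification since they come from the same children.

```python
import math

Z95 = 1.96  # normal quantile for a 95% interval


def se_from_ci(lower, upper):
    """Back out an approximate standard error from a 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def difference_ci(est_a, se_a, est_b, se_b):
    """Estimate and 95% CI for est_a - est_b, ASSUMING independent estimates."""
    diff = est_a - est_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, diff - Z95 * se, diff + Z95 * se


# TV estimate quoted from the paper: 0.13 points (95% CI 0.03 to 0.24).
tv_est, tv_se = 0.13, se_from_ci(0.03, 0.24)

# HYPOTHETICAL videogame estimate, for illustration only:
# a similar point estimate with a wider interval that crosses zero.
games_est, games_se = 0.10, se_from_ci(-0.10, 0.30)

diff, lower, upper = difference_ci(tv_est, tv_se, games_est, games_se)
print(f"TV minus games: {diff:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With numbers like these, one estimate is 'significant' and the other isn't, but the interval for the difference comfortably includes zero.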

June 27, 2013

Making sense of uncertainty

Sense about Science (a British charity whose name, unusually, is actually accurate) have just launched a publication “Making Sense of Uncertainty”, following their previous guides for the public and journalists that cover screening, medical tests, chemical stories, statistics, and radiation.

Researchers in climate science, disease modelling, epidemiology, weather forecasting and natural hazard prediction say that we should be relieved when scientists describe the uncertainties in their work. It doesn’t necessarily mean that we cannot make decisions – we might well have ‘operational knowledge’ – but it does mean that there is greater confidence about what is known and unknown.

Launching a guide to Making Sense of Uncertainty at the World Conference of Science Journalists today, researchers working in some of the most significant, cutting edge fields say that if policy makers and the public are discouraged by the existence of uncertainty, we miss out on important discussions about the development of new drugs, taking action to mitigate the impact of natural hazards, how to respond to the changing climate and to pandemic threats.

Interrogated with the question ‘But are you certain?’, they say, they have ended up sounding defensive or as though their results are not meaningful. Instead we need to embrace uncertainty, especially when trying to understand more about complex systems, and ask about operational knowledge: ‘What do we need to know to make a decision? And do we know it?’

Guide to reporting clinical trials

From the World Conference of Science Journalists, via @roobina (Ruth Francis), ten tweets on reporting clinical trials

  1. Was this #trial registered before it began? If not then check for rigged design, or hidden negative results on similar trials.
  2. Is primary outcome reported in paper the same as primary outcome spec in protocol? If no report maybe deeply flawed.
  3. Look for other trials by co or group, or on treatment, on registries to see if it represents cherry picked finding
  4. ALWAYS mention who funded the trial. Do any of ethics committee people have some interest with the funding company
  5. Will country where work is done benefit? Will drug be available at lower cost? Is disorder or disease a problem there
  6. How many patients were on the trial, and how many were in each arm?
  7. What was being compared (drug vs placebo? Drug vs standard care? Drug with no control arm?
  8. Be precise about people/patient who benefited – advanced disease, a particular form of a disease?
  9. Report natural frequencies: “13 people per 10000 experienced x”, rather than “1.3% of people experienced x”
  10. NO relative risks. Paint findings clearly: improved survival by 3%: BAD. Ppl lived 2 months longer on average: GOOD

Who says you can’t say anything useful in 140 characters?
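
Tips 9 and 10 are easy to get wrong in practice, so here is a minimal sketch of the arithmetic. The function names and the trial risks are made up for illustration. One thing the conversion shows: 1.3% is 130 people per 10,000, and 13 per 10,000 is 0.13%, so the two figures in tip 9 are best read as examples of the two formats rather than as the same rate.

```python
def natural_frequency(proportion, per=10_000):
    """Express a proportion as a whole number of people per `per`."""
    return round(proportion * per)


def absolute_and_relative(risk_control, risk_treated):
    """Return (absolute risk difference, relative risk) for two risks."""
    return risk_treated - risk_control, risk_treated / risk_control


print(natural_frequency(0.013))   # 1.3% as people per 10,000
print(natural_frequency(0.0013))  # 0.13% as people per 10,000

# HYPOTHETICAL trial: 2% of the control arm and 1% of the treated arm have the event.
ard, rr = absolute_and_relative(0.02, 0.01)
print(f"relative risk {rr:.2f}, i.e. 'halves the risk'")
print(f"absolute change {ard:+.2%}, i.e. "
      f"{natural_frequency(abs(ard))} fewer people per 10,000")
```

The same hypothetical result can be reported as "halves the risk" or as "100 fewer people per 10,000"; the second is the one the tweets are asking for.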

June 20, 2013

Does success in education rely on having certain genes?

If you have read media stories recently that say ‘yes’, you’d better read this article from the Genetic Literacy Project …

May 7, 2013

Modestly significant

From a comment piece in Stuff, by Bruce Robertson (of Hospitality NZ)

In the past five years, the level of hazardous drinking has significantly decreased for men (from 30 per cent to 26 per cent) and marginally decreased for women (13 per cent to 12 per cent).

There was a modest but important drop in the rates of hazardous drinking among Maori adults, with the rate falling from 33 per cent to 29 per cent in the latest survey.

As @tui_talk pointed out on Twitter, that’s a four percentage point decrease described as “significant” for men and “modest” for Maori.

At first I thought this might be a confusion of “statistically significant” with “significant”, with the decrease in men being statistically significant but the difference in Maori not, but in fact the MoH report being referenced says (p4)

As a percentage of all Māori adults, hazardous drinking patterns significantly decreased from 2006/07 (33%) to 2011/12 (29%). 



April 11, 2013

Power failure threatens neuroscience

A new research paper with the cheeky title “Power failure: why small sample size undermines the reliability of neuroscience” has come out in a neuroscience journal. The basic idea isn’t novel, but it’s one of those statistical points that make your life more difficult (if more productive) when you understand it.  Small research studies, as everyone knows, are less likely to detect differences between groups.  What is less widely appreciated is that even if a small study sees a difference between groups, that difference is less likely to be real than one seen in a large study.

The ‘power’ of a statistical test is the probability that you will detect a difference if there really is a difference of the size you are looking for.  If the power is 90%, say, then you are pretty sure to see a difference if there is one, and based on standard statistical techniques, pretty sure not to see a difference if there isn’t one. Either way, the results are informative.

Often you can’t afford to do a study with 90% power given the current funding system. If you do a study with low power, and the difference you are looking for really is there, you still have to be pretty lucky to see it — the data have to, by chance, be more favorable to your hypothesis than they should be.   But if you’re relying on the  data being more favorable to your hypothesis than they should be, you can see a difference even if there isn’t one there.
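
A back-of-the-envelope version of this argument, as a sketch: if a fraction `prior` of the hypotheses being tested are actually true, the chance that a 'significant' result reflects a real difference depends on the power. The 20% prior below is an assumption picked purely for illustration, not a figure from the paper, and the 5% threshold is just the conventional one.

```python
def prob_real(power, alpha=0.05, prior=0.2):
    """Probability that a statistically significant result is a real effect.

    power: chance of detecting a difference that really is there
    alpha: chance of 'detecting' a difference that isn't there
    prior: ASSUMED fraction of tested hypotheses that are true
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


for power in (0.9, 0.5, 0.2):
    print(f"power {power:.0%}: {prob_real(power):.0%} of significant results are real")
```

Under these assumptions, dropping the power from 90% to 20% takes you from most significant findings being real to about a coin toss.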

Combine this with publication bias: if you find what you are looking for, you get enthusiastic and send it off to high-impact research journals.  If you don’t see anything, you won’t be as enthusiastic, and the results might well not be published.  After all, who is going to want to look at a study that couldn’t have found anything, and didn’t?  The result is that we get lots of exciting neuroscience news, often with very pretty pictures, that isn’t true.

The same is true for nutrition: I have a student doing an Honours project looking at replicability (in a large survey database) of the sort of nutrition and health stories that make it to the local papers. So far, as you’d expect, the associations are a lot weaker when you look in a separate data set.

Clinical trials went through this problem a while ago, and while they often have lower power than one would ideally like, there’s at least no way you’re going to run a clinical trial in the modern world without explicitly working out the power.

Other people’s reactions

January 21, 2013

Journalist on science journalism

From Columbia Journalism Review (via Tony Cooper), a good long piece on science journalism by David H. Freedman (whom Google seems to confuse with statistician David A. Freedman)

What is a science journalist’s responsibility to openly question findings from highly credentialed scientists and trusted journals? There can only be one answer: The responsibility is large, and it clearly has been neglected. It’s not nearly enough to include in news reports the few mild qualifications attached to any study (“the study wasn’t large,” “the effect was modest,” “some subjects withdrew from the study partway through it”). Readers ought to be alerted, as a matter of course, to the fact that wrongness is embedded in the entire research system, and that few medical research findings ought to be considered completely reliable, regardless of the type of study, who conducted it, where it was published, or who says it’s a good study.

Worse still, health journalists are taking advantage of the wrongness problem. Presented with a range of conflicting findings for almost any interesting question, reporters are free to pick those that back up their preferred thesis—typically the exciting, controversial idea that their editors are counting on. When a reporter, for whatever reasons, wants to demonstrate that a particular type of diet works better than others—or that diets never work—there is a wealth of studies that will back him or her up, never mind all those other studies that have found exactly the opposite (or the studies can be mentioned, then explained away as “flawed”). For “balance,” just throw in a quote or two from a scientist whose opinion strays a bit from the thesis, then drown those quotes out with supportive quotes and more study findings.

I think the author is unduly negative about medical science — part of the problem is that published claims of associations are expected to have a fairly high false positive rate, and there’s not necessarily anything wrong with that as long as everyone understands the situation. Lowering the false positive rate would require either much higher sample sizes or a much higher false negative rate, and the coordination needed to get a sample size that will make the error rate low is prohibitive in most settings (with phase III clinical trials and modern genome-wide association studies as two partial exceptions). It’s still true that most interesting or controversial findings about nutrition are wrong, and that journalists should know they are mostly wrong, and should write as if they know this. Not reprinting Daily Mail stories would probably help, too.


November 25, 2012

Is family violence getting worse?

Stuff thinks so, but actually it’s hard to say.  The statistics have recently been revised (as the paper complained about in April).

The paper, and the Labour spokeswoman, focus on the numbers of deaths in 2008 and 2011: 18 and 27 respectively.

The difference between 18 and 27 isn’t all that statistically significant: a difference that big would happen by chance about 10% of the time even assuming all the deaths are separate cases.  It’s pretty unlikely that the 50% difference reflects a 50% increase in domestic violence, but it might be a sign that there has been some increase. Or not.
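
One way to arrive at a figure of that size, as a sketch rather than necessarily the calculation behind the number above: treat the deaths in each year as independent counts with the same underlying rate. Then, given 45 deaths in total, the number falling in 2011 behaves like 45 coin tosses, and we can ask how often 27 or more would land on one side.

```python
from math import comb


def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 1/2), computed exactly."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


deaths_2008, deaths_2011 = 18, 27
total = deaths_2008 + deaths_2011

# One-sided: chance of a split at least this lopsided in favour of 2011.
p_one_sided = upper_tail(deaths_2011, total)
# Two-sided: a split this lopsided in either direction.
p_two_sided = min(1.0, 2 * p_one_sided)

print(f"one-sided: {p_one_sided:.3f}, two-sided: {p_two_sided:.3f}")
```

The one-sided probability comes out a bit over 10%, and the two-sided version is twice that; either way, it is not strong evidence of a change.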

The Minister doesn’t do any better: she quotes a different version of the numbers, women killed by their partners (6 in 2008, 14 in 2009, 9 in 2011), as if this was some sort of refutation, and points to targets that just say the government hopes things will improve in the future.

There’s no way that figures for deaths, which are a few hundredths of a percent of all cases investigated by the police, are going to answer either the political fingerpointing question or the real question of how much domestic violence there is, and whether it’s getting better or worse.  It’s obvious why the politicians want to pretend that their favorite numbers are the answer, but there’s no need for journalists to go along with it.