Posts filed under Significance (18)

May 7, 2013

Modestly significant

From a comment piece in Stuff, by Bruce Robertson (of Hospitality NZ)

In the past five years, the level of hazardous drinking has significantly decreased for men (from 30 per cent to 26 per cent) and marginally decreased for women (13 per cent to 12 per cent).

There was a modest but important drop in the rates of hazardous drinking among Maori adults, with the rate falling from 33 per cent to 29 per cent in the latest survey.

As @tui_talk pointed out on Twitter, that’s a four percentage point decrease described as “significant” for men and “modest” for Maori.

At first I thought this might be a confusion of “statistically significant” with “significant”, with the decrease in men being statistically significant but the difference in Maori not, but in fact the MoH report being referenced says (p4)

As a percentage of all Māori adults, hazardous drinking patterns significantly decreased from 2006/07 (33%) to 2011/12 (29%). 



April 11, 2013

Power failure threatens neuroscience

A new research paper with the cheeky title “Power failure: why small sample size undermines the reliability of neuroscience” has come out in a neuroscience journal. The basic idea isn’t novel, but it’s one of these statistical points that makes your life more difficult (if more productive) when you understand it.  Small research studies, as everyone knows, are less likely to detect differences between groups.  What is less widely appreciated is that even when a small study does see a difference between groups, the difference is more likely not to be real.

The ‘power’ of a statistical test is the probability that you will detect a difference if there really is a difference of the size you are looking for.  If the power is 90%, say, then you are pretty sure to see a difference if there is one, and based on standard statistical techniques, pretty sure not to see a difference if there isn’t one. Either way, the results are informative.
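As a back-of-envelope illustration (my sketch, not from the paper): for a two-group comparison with standardised effect size d and n observations per group, the power of a two-sided 5% test is roughly Φ(d·√(n/2) − 1.96), where Φ is the standard normal distribution function.

```python
from math import erf, sqrt

def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def approx_power(d: float, n_per_group: int, z_crit: float = 1.96) -> float:
    """Approximate power of a two-sided 5%-level two-sample test for
    standardised effect size d with n_per_group in each arm
    (normal approximation, ignoring the tiny wrong-sign tail)."""
    return normal_cdf(d * sqrt(n_per_group / 2) - z_crit)

print(round(approx_power(0.5, 64), 2))
```

The classic benchmark drops out of this: d = 0.5 with 64 per group gives roughly 80% power, which is why "about 64 per group" turns up so often in sample-size folklore.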

Often you can’t afford to do a study with 90% power given the current funding system. If you do a study with low power, and the difference you are looking for really is there, you still have to be pretty lucky to see it — the data have to, by chance, be more favorable to your hypothesis than they should be.   But if you’re relying on the  data being more favorable to your hypothesis than they should be, you can see a difference even if there isn’t one there.
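A quick simulation makes the "lucky data" point concrete (my illustration, with made-up numbers: a true standardised effect of 0.2, 20 per group, and a simple z-test with unit-variance groups):

```python
import random
from math import sqrt
from statistics import mean

random.seed(1)

def simulate(true_d=0.2, n=20, sims=20000, z_crit=1.96):
    """Simulate two-group z-tests with a small true effect and low power.
    Returns (power, mean |observed effect| among significant results).
    The observed difference in means is drawn directly from its
    sampling distribution rather than simulating raw observations."""
    se = sqrt(2 / n)  # SE of a difference in means, unit-variance groups
    sig_effects = []
    for _ in range(sims):
        diff = random.gauss(true_d, se)
        if abs(diff / se) > z_crit:
            sig_effects.append(abs(diff))
    return len(sig_effects) / sims, mean(sig_effects)

power, avg_sig = simulate()
print(f"power ~ {power:.2f}; mean significant effect ~ {avg_sig:.2f} (true 0.2)")
```

With these numbers the power is only around 10%, and the effects that do cross the significance threshold average more than three times the true effect size: the "winner's curse" pattern the paper describes.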

Combine this with publication bias: if you find what you are looking for, you get enthusiastic and send it off to high-impact research journals.  If you don’t see anything, you won’t be as enthusiastic, and the results might well not be published.  After all, who is going to want to look at a study that couldn’t have found anything, and didn’t?  The result is that we get lots of exciting neuroscience news, often with very pretty pictures, that isn’t true.

The same is true for nutrition: I have a student doing an Honours project looking at replicability (in a large survey database) of the sort of nutrition and health stories that make it to the local papers. So far, as you’d expect, the associations are a lot weaker when you look in a separate data set.

Clinical trials went through this problem a while ago, and while they often have lower power than one would ideally like, there’s at least no way you’re going to run a clinical trial in the modern world without explicitly working out the power.

Other people’s reactions

January 21, 2013

Journalist on science journalism

From Columbia Journalism Review (via Tony Cooper), a good long piece on science journalism by David H. Freedman (whom Google seems to confuse with statistician David A. Freedman)

What is a science journalist’s responsibility to openly question findings from highly credentialed scientists and trusted journals? There can only be one answer: The responsibility is large, and it clearly has been neglected. It’s not nearly enough to include in news reports the few mild qualifications attached to any study (“the study wasn’t large,” “the effect was modest,” “some subjects withdrew from the study partway through it”). Readers ought to be alerted, as a matter of course, to the fact that wrongness is embedded in the entire research system, and that few medical research findings ought to be considered completely reliable, regardless of the type of study, who conducted it, where it was published, or who says it’s a good study.

Worse still, health journalists are taking advantage of the wrongness problem. Presented with a range of conflicting findings for almost any interesting question, reporters are free to pick those that back up their preferred thesis—typically the exciting, controversial idea that their editors are counting on. When a reporter, for whatever reasons, wants to demonstrate that a particular type of diet works better than others—or that diets never work—there is a wealth of studies that will back him or her up, never mind all those other studies that have found exactly the opposite (or the studies can be mentioned, then explained away as “flawed”). For “balance,” just throw in a quote or two from a scientist whose opinion strays a bit from the thesis, then drown those quotes out with supportive quotes and more study findings.

I think the author is unduly negative about medical science — part of the problem is that published claims of associations are expected to have a fairly high false positive rate, and there’s not necessarily anything wrong with that as long as everyone understands the situation.  Lowering the false positive rate would require either much larger sample sizes or a much higher false negative rate, and the coordination problems involved in getting a sample size that makes the error rate low are prohibitive in most settings (with phase III clinical trials and modern genome-wide association studies as two partial exceptions).  It’s still true that most interesting or controversial findings about nutrition are wrong, that journalists should know they are mostly wrong, and that they should write as if they know this.  Not reprinting Daily Mail stories would probably help, too.
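The expected false positive rate among published associations is simple arithmetic (a sketch with illustrative numbers of my own choosing, not figures from the article): the proportion of "significant" findings that are real depends on the power, the significance threshold, and the prior proportion of tested hypotheses that are true.

```python
def positive_predictive_value(prior: float, power: float, alpha: float) -> float:
    """Proportion of 'significant' findings that are true positives,
    given the prior probability that a tested hypothesis is true."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Illustrative numbers (assumptions, not from the post): if 1 in 10 tested
# associations is real, studies have 50% power, and tests use alpha = 0.05,
# nearly half the positive findings will be false.
print(round(positive_predictive_value(0.10, 0.50, 0.05), 2))
```

Pushing the false positive share down from there means either raising power (bigger samples) or tightening alpha (more false negatives), which is the tradeoff described above.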


November 25, 2012

Is family violence getting worse?

Stuff thinks so, but actually it’s hard to say.  The statistics have recently been revised (as the paper complained about in April).

The paper, and the Labour spokeswoman, focus on the numbers of deaths in 2008 and 2011: 18 and 27 respectively.

The difference between 18 and 27 isn’t all that statistically significant: a difference that big would happen by chance about 10% of the time even assuming all the deaths are separate cases.  It’s pretty unlikely that the 50% difference reflects a 50% increase in domestic violence, but it might be a sign that there has been some increase. Or not.
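A rough version of that calculation (my sketch): condition on the 45 deaths in total; if the underlying rate hasn’t changed, each death is equally likely to fall in either year, so the 2011 count behaves like a Binomial(45, ½) draw.

```python
from math import comb

def binom_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 18 deaths in 2008 and 27 in 2011: 45 in total.
# Chance of a split at least this uneven in the 2011 direction:
p_one_sided = binom_tail(27, 45)
print(f"one-sided p ~ {p_one_sided:.2f}")
```

The one-sided tail probability comes out a little over 0.1, so a split at least as uneven as 18 versus 27 is nothing unusual under chance alone.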

The Minister doesn’t do any better: she quotes a different version of the numbers, women killed by their partners (6 in 2008, 14 in 2009, 9 in 2011), as if this were some sort of refutation, and points to targets that just say the government hopes things will improve in the future.

There’s no way that figures for deaths, which are a few hundredths of a percent of all cases investigated by the police, are going to answer either the political finger-pointing question or the real question of how much domestic violence there is, and whether it’s getting better or worse.  It’s obvious why the politicians want to pretend that their favorite numbers are the answer, but there’s no need for journalists to go along with it.

August 23, 2012

Stat-related startups

At Simply Statistics, a set of stat/data related startups.

One that looks interesting for teaching and for data journalism purposes is Statwing, which is building a web-based pointy-clicky data analysis system, aiming to have good graphics and good text descriptions of the results.  This is the sort of project where the details will matter a lot — poking around at their demo there were a few things I was slightly unhappy about, but nothing devastatingly bad, so there is potential.

February 17, 2012

The right music makes you younger.

Researchers asked 30 University of Pennsylvania undergraduates to listen to a randomly-assigned piece of music, and then to record their birth dates.

According to their birth dates, people were nearly a year-and-a-half younger after listening to “When I’m Sixty-Four” (adjusted M = 20.1 years) rather than to “Kalimba” (adjusted M = 21.5 years), F(1, 17) = 4.92, p = .040.

This is a randomized experiment, not just an observational study, so we can conclude that listening to the Beatles actually changes your date of birth.

The point of the paper was to show that various sorts of sloppy design and modestly dodgy reporting of statistical analyses, especially in small data sets, can lead to finding pretty much anything you want.  You can then issue a press release about it and end up in the newspapers.
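One of the designs the paper skewers, optional stopping, is easy to simulate (my sketch, using a simple z-test with unit-variance groups): test after 20 observations per group and, if the result isn’t significant, collect 10 more per group and test again.

```python
import random
from math import sqrt

random.seed(1)

def z_stat(a, b):
    """z statistic for a difference in means, unit-variance groups assumed."""
    na, nb = len(a), len(b)
    return (sum(a) / na - sum(b) / nb) / sqrt(1 / na + 1 / nb)

def one_flexible_experiment(z_crit=1.96):
    """No real effect anywhere: test at n=20 per group, and if that
    isn't significant, add 10 more per group and test again."""
    a = [random.gauss(0, 1) for _ in range(20)]
    b = [random.gauss(0, 1) for _ in range(20)]
    if abs(z_stat(a, b)) > z_crit:
        return True
    a += [random.gauss(0, 1) for _ in range(10)]
    b += [random.gauss(0, 1) for _ in range(10)]
    return abs(z_stat(a, b)) > z_crit

sims = 4000
rate = sum(one_flexible_experiment() for _ in range(sims)) / sims
print(f"false positive rate ~ {rate:.3f} (nominal 0.05)")
```

Even with no real effect, the false positive rate comes out well above the nominal 5%, and adding more "looks" at the data pushes it higher still.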

Some fields of science already know about this problem and have at least attempted to introduce safeguards. Medical researchers and statisticians have put together reporting guidelines that the better medical journals insist on following. The  CONSORT guidelines for randomized trials are pretty widely accepted.  More recently, STROBE addresses observational studies, and  PRISMA (formerly QUOROM) covers systematic reviews.


January 11, 2012

Harold and Kumar in the Lung Lab

There’s a reasonably good article (from Reuters) about a new research report in the journal JAMA on marijuana, tobacco, and lung function.  It’s also worth pointing out that, as usual, there is more information online that the newspapers don’t tell you about: the abstract, and a video interview (ie, press release) with the main author.

The research report is from the CARDIA study, which is following up 5000 young adults in four US cities, primarily to look at cardiovascular disease risks.   This paper uses data on tobacco smoking, marijuana smoking, and lung function, and finds a decrease in lung function in people who took up tobacco smoking, but not in most of those who smoked pot.  The study is particularly interesting because the participants were recruited before they started smoking.

The researchers found that lung function of marijuana smokers doesn’t start going down until the cumulative exposure gets up to around 10 joint-years (ie, 1 joint/day for ten years, two/day for five years, and so on), which is a pretty high level, reached by only a few of the CARDIA participants.  This is contrasted with tobacco, where negative effects start showing up at exposure levels that are quite common.  In fact, and this is what’s getting the press, the average lung function is very slightly higher in pot smokers than non-smokers, though not by enough that you would notice without sensitive machinery and thousands of measurements.

Most of this is just that doses of marijuana are much, much lower.  In fact, the negative associations at 10 joint-years cumulative exposure to marijuana are pretty similar to those at 10 pack-years cumulative exposure to tobacco.  That is, the data are consistent with a single joint doing as much short-term lung damage  as a whole pack of cigarettes, and certainly indicate that a joint does more damage than a single cigarette.
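The dose arithmetic behind that comparison (my back-of-envelope restatement): a joint-year and a pack-year both mean one unit a day for a year, but the pack holds 20 cigarettes.

```python
# One "joint-year" is one joint a day for a year; one "pack-year" is one
# 20-cigarette pack a day for a year.
joints_per_joint_year = 365
cigarettes_per_pack_year = 365 * 20

joints = 10 * joints_per_joint_year          # 3,650 joints
cigarettes = 10 * cigarettes_per_pack_year   # 73,000 cigarettes

# If similar damage shows up at 10 joint-years and at 10 pack-years,
# each joint is doing about as much damage as a pack:
print(cigarettes // joints)
```

So equal damage at equal "years" of cumulative exposure means each joint is keeping pace with about 20 cigarettes.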

The mysterious part is the slightly higher lung function among moderate-level pot smokers. The authors give a number of speculations.  My speculation would be that asthmatics, and others with sensitive airways, are slightly less likely to take up smoking pot.  I think this would actually be testable in the CARDIA data.  In any case, this is a careful analysis of pretty high-quality data, done by people with no particular axe to grind, which makes a nice change.


April 7, 2011

This cartoon is good

“So, uh, we did the green study again and got no link. It was probably a– RESEARCH CONFLICTED ON GREEN JELLY BEAN/ACNE LINK; MORE STUDY RECOMMENDED!”