Posts filed under Research (126)

August 30, 2014

Funding vs disease burden: two graphics

You have probably seen the graphic from vox.com


There are several things wrong with it. From a graphics point of view it doesn’t make any of the relevant comparisons easy. The diameter of each circle is proportional to the deaths or money, so the areas exaggerate the differences. And the donation data are basically wrong: the original story tries to make it clear that these are particular events, not all donations for a disease, but it’s the graph that is quoted.

For example, the graph lists $54 million for heart disease, based on the ‘Jump Rope for Heart’ fundraiser. According to Forbes magazine’s list of top charities, the American Heart Association actually received $511 million in private donations in the year to June 2012, almost ten times as much.  Almost as much again came in grants for heart disease research from the National Institutes of Health.

There’s another graph I’ve seen on Twitter, which shows what could have been done to make the comparisons clearer:



It’s limited, because it only shows government funding, not private charity, but it shows the relationship between funding and the aggregate loss of health and life for a wide range of diseases.

There are a few outliers, and some of them are for interesting reasons. Tuberculosis is not currently a major health problem in the US, but it is in other countries, and there’s a real risk that it could spread to the US.  AIDS is highly funded partly because of successful lobbying, partly because it — like TB — is a foreign-aid issue, and partly because it has been scientifically rewarding and interesting. COPD and lung cancer are going to become much less common in the future, as the victims of the century-long smoking epidemic die off.

Depression and injuries, though?


Update: here’s how distorted the areas are: the purple number is about 4.2 times the blue number
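
To put a number on that distortion, here is a minimal sketch in Python. It assumes, as described above, that each circle's diameter is proportional to the value it represents. The function names are illustrative, and 4.2 is simply the ratio quoted in the update.

```python
import math


def circle_area(diameter: float) -> float:
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2) ** 2


def area_ratio_when_scaling_diameter(value_ratio: float) -> float:
    """Ratio of areas when diameters are drawn proportional to the values."""
    return circle_area(value_ratio) / circle_area(1.0)


def diameter_for_area_scaling(value: float) -> float:
    """Diameter that makes the circle's area (not its diameter) proportional to value."""
    return 2 * math.sqrt(value / math.pi)


value_ratio = 4.2
print(f"values differ by a factor of {value_ratio}")
print(f"areas differ by a factor of {area_ratio_when_scaling_diameter(value_ratio):.2f}")
```

With diameters proportional to the values, a 4.2-fold difference is drawn as a roughly 17.6-fold difference in area (4.2 squared). Scaling the diameter by the square root of the value, as in `diameter_for_area_scaling`, would keep the areas in proportion.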


August 28, 2014

Age, period, um, cohort

A recurring issue with trends over time is whether they are ‘age’ trends, ‘period’ trends, or ‘cohort’ trends.  That is, when we complain about ‘kids these days’, is it ‘kids’ or ‘these days’ that’s the problem? Mark Liberman at Language Log has a nice example using analyses by Joe Fruehwald.


If you look at the frequency of “um” in speech (in this case in Philadelphia), it decreases with age at any given year



On the other hand, it increases over time for people in a given birth cohort (for example, the line that stretches right across the graph is for people born in the 1950s)



It’s not that people say “um” less as they get older, it’s that people born a long time ago say “um” less than people born recently.
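
The pattern can be reproduced with a toy model. This is only a sketch: the coefficients and the linear form are invented for illustration and are not taken from the Philadelphia data. The hypothetical `um_rate` function depends on birth year and calendar year, and contains no term for ageing reducing "um".

```python
def um_rate(birth_year: int, year: int) -> float:
    """Hypothetical 'um' rate with a cohort effect and a period effect, no age term."""
    cohort_effect = 0.03 * (birth_year - 1900)  # later-born speakers say "um" more
    period_effect = 0.01 * (year - 1970)        # everyone says "um" a bit more over time
    return cohort_effect + period_effect


# Cross-section: one calendar year, speakers of different ages.
survey_year = 2000
by_age = {age: um_rate(survey_year - age, survey_year) for age in (20, 40, 60)}

# Follow-up: one birth cohort, observed in different years.
by_year = {year: um_rate(1950, year) for year in (1980, 1990, 2000)}

print("rate by age in", survey_year, by_age)
print("rate by year for the 1950 cohort", by_year)
```

In the cross-section the rate falls with age, yet within the 1950 cohort it rises over time, which is the combination seen in the two graphs. Because age is just calendar year minus birth year, a linear model like this one cannot separate a period effect from an age effect on its own; the within-cohort increase could equally be written as people saying "um" more as they age.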

August 15, 2014

Cancer statistics done right

I’ve mentioned a number of times that statistics on cancer survival are often unreliable for the conclusion people want to draw, and that you need to look at cancer mortality.  Today’s story in Stuff is about Otago research that does it right:

The report found for 11-year timeframe, cancer-specific death rates decreased in both countries and cancer mortality fell in both countries. But there was no change in the difference between the death rates New Zealand and Australia, which remained 10 per cent higher in New Zealand.

That is, they didn’t look at survival after diagnosis, they looked at the rate of deaths. They also looked at the rate of cancer diagnoses

“The higher mortality from all cancers combined cannot be attributed to higher incidence rates, and this suggests that overall patient survival is lower in New Zealand,” Skegg said.

That’s not quite as solid a conclusion — it’s conceivable that New Zealand really has higher incidence, but Australia compensates by over-diagnosing tumours that wouldn’t ever cause a problem — but it would be a stretch to have that happen over all types of cancer combined, as they observed.
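
A small sketch of why the data alone can't rule out the over-diagnosis story. All the numbers here are made up (rates per 100,000 per year), and the model is deliberately crude: it assumes mortality is just the rate of real cancers times the fraction that are fatal. The `Scenario` class is hypothetical, not anything from the Otago report.

```python
import math
from dataclasses import dataclass


@dataclass
class Scenario:
    true_cases: float     # cancers that would cause harm, per 100,000 per year
    overdiagnosed: float  # harmless tumours that are diagnosed anyway
    fatality: float       # fraction of true cases that end in death

    @property
    def recorded_incidence(self) -> float:
        return self.true_cases + self.overdiagnosed

    @property
    def mortality(self) -> float:
        return self.true_cases * self.fatality


# Explanation 1: same real incidence, worse survival in New Zealand.
nz_survival = Scenario(true_cases=330, overdiagnosed=0, fatality=0.40)
au_survival = Scenario(true_cases=330, overdiagnosed=0, fatality=0.40 / 1.1)

# Explanation 2: same survival, more real cancer in New Zealand,
# with Australian over-diagnosis making the recorded incidence look equal.
nz_overdx = Scenario(true_cases=330, overdiagnosed=0, fatality=0.40)
au_overdx = Scenario(true_cases=300, overdiagnosed=30, fatality=0.40)

for label, nz, au in [
    ("worse survival", nz_survival, au_survival),
    ("over-diagnosis", nz_overdx, au_overdx),
]:
    ratio = nz.mortality / au.mortality
    print(label, nz.recorded_incidence, au.recorded_incidence, round(ratio, 2))
```

Both explanations give the same recorded incidence in the two countries and a 10 per cent higher death rate in New Zealand, so the choice between them rests on plausibility rather than on these two summaries.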


July 28, 2014

Rise of the machines



The Automatic Statistician project (somewhat flaky website) is working to automate various types of statistical modelling. They have interesting research papers. They also have a demo that’s fairly limited but produces linear regression models, model checks, and descriptions that are reasonable from a predictive point of view.

Automating some bits of data analysis is an important problem, because there aren’t enough statisticians to go around. However (as Cathy O’Neil points out about competition sites like Kaggle), they aren’t tackling the hard bits of data analysis: getting the data ready, and more importantly, getting the question into a precisely-specified form that can be answered by fitting a model.

July 23, 2014

Human statisticians not obsolete

There’s a website, OnlyBoth, that, as it says

Discovers New Insights from Data.
Writes Them Up in Perfect English.
All Automated.

You can test this by asking it for ‘insights’ in some example areas. One area is baseball, so naturally I selected the Seattle Mariners, and 2009, when I still lived in Seattle. OnlyBoth returns several names where it found insights, and I chose ‘Matt Tuiasosopo’ — the most obvious thing about him is that he comes from a famous local football family, but I was interested in what new insight the data revealed.

Matt Tuiasosopo in 2009 was the 2nd-youngest (23 yrs) of the 25 hitters who were born in Washington and played for the Seattle Mariners.

outdone by Matt Tuiasosopo in 2008 (22 yrs).

I don’t think our students need to be too worried yet.

July 13, 2014

Age/period/cohort voting

From the New York Times, an interactive graph showing how political leanings at different ages have changed over time


Yes, voting preferences for kids are problematic. Read the story (and this link) to find out how they inferred them. There’s more at Andrew Gelman’s blog.

July 1, 2014

Facebook recap

The discussion over the Facebook experiment seems to involve a lot of people being honestly surprised that other people feel differently.

One interesting correlation based on my Twitter feed is that scientists involved in human subjects research were disturbed by the research and those not involved in human subjects research were not. This suggests our indoctrination in research ethics has some impact, but doesn’t answer the question of who is right.

Some links that cover most of the issues

June 29, 2014

Not yet news

When you read “The university did not reveal how the study was carried out” in a news story about a research article, you’d expect the story to be covering some sort of scandal. Not this time.

The Herald story  is about broccoli and asthma

They say eating up to two cups of lightly steamed broccoli a day can help clear the airways, prevent deterioration in the condition and even reduce or reverse lung damage.

Other vegetables with the same effect include kale, cabbage, brussels sprouts, cauliflower and bok choy.

Using broccoli to treat asthma may also help for people who don’t respond to traditional treatment.

‘How the study was carried out’ isn’t just a matter of detail: if they just gave people broccoli, they wouldn’t know what other vegetables had the same effect, so maybe it wasn’t broccoli but some sort of extract? Was it even experimental or just observational? And did they actually test people who don’t respond to traditional treatment? And what exactly does that mean — failing to respond is pretty rare, though failing to get good control of asthma attacks isn’t.

The Daily Mail story was actually more informative (and that’s not a sentence I like to find myself writing). They reported a claim that wasn’t in the press release

The finding due to sulforaphane naturally occurring in broccoli and other cruciferous vegetables, which may help protect against respiratory inflammation that can cause asthma.

Even then, it isn’t clear whether the research really found that sulforaphane was responsible, or whether that’s just their theory about why broccoli is effective. 

My guess is that the point of the press release is the last sentence

Ms Mazarakis will be presenting the research findings at the 2014 Undergraduate Research Conference about Food Safety in Shanghai, China.

That’s a reasonable basis for a press release, and potentially for a story if you’re in Melbourne. The rest isn’t. It’s not science until they tell you what they did.

Ask first

Via The Atlantic, there’s a new paper in PNAS (open access) that I’m sure is going to be a widely cited example by people teaching research ethics, and not in a good way:

 In an experiment with people who use Facebook, we test whether emotional contagion occurs outside of in-person interaction between individuals by reducing the amount of emotional content in the News Feed. When positive expressions were reduced, people produced fewer positive posts and more negative posts; when negative expressions were reduced, the opposite pattern occurred. These results indicate that emotions expressed by others on Facebook influence our own emotions, constituting experimental evidence for massive-scale contagion via social networks.

More than 650,000 people had their Facebook feeds meddled with in this way, and as that paragraph from the abstract makes clear, it made a difference.

The problem is consent.  There is a clear ethical principle that experiments on humans require consent, except in a few specific situations, and that the consent has to be specific and informed. It’s not that uncommon in psychological experiments for some details of the experiment to be kept hidden to avoid bias, but participants still should be given a clear idea of possible risks and benefits and a general idea of what’s going on. Even in medical research, where clinical trials are comparing two real treatments for which the best choice isn’t known, there are very few exceptions to consent (I’ve written about some of them elsewhere).

The need for consent is especially clear in cases where the research is expected to cause harm. In this example, the Facebook researchers expected in advance that their intervention would have real effects on people’s emotions; that it would do actual harm, even if the harm was (hopefully) minor and transient.

Facebook had its research reviewed by an Institutional Review Board (the US equivalent of our Ethics Committees), and the terms of service say they can use your data for research purposes, so they are probably within the law.  The psychologist who edited the study for PNAS said

“I was concerned,” Fiske told The Atlantic, “until I queried the authors and they said their local institutional review board had approved it—and apparently on the grounds that Facebook apparently manipulates people’s News Feeds all the time.”

Fiske added that she didn’t want “the originality of the research” to be lost, but called the experiment “an open ethical question.”

To me, the only open ethical question is whether people believed their agreement to the Facebook Terms of Service allowed this sort of thing. This could be settled empirically, by a suitably-designed survey. I’m betting the answer is “No.” Or, quite likely, “Hell, no!”.

[Update: Story in the Herald]

June 3, 2014

Are girl hurricanes less scary?

There’s a new paper out in the journal PNAS claiming that hurricanes with female names cause three times as many deaths as those with male names (because people don’t give girl hurricanes the proper respect). Ed Yong does a good job of explaining why this is probably bogus, but no-one seems to have drawn any graphs, which I think make the situation a lot clearer.