Posts written by Thomas Lumley (1485)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

May 26, 2015

Who is my neighbour?

The Herald has a story with data from the General Social Survey. Respondents were asked if they would feel comfortable with a neighbour who was from a religious minority, LGBT, from an ethnic or racial minority, with mental illness, or a new migrant.  The point of the story was that the figure was about 50% for mental illness, compared to about 75% for the other groups. It’s a good story; you can go read it.

What I want to do here is look at how the 75% varies across the population, using the detailed tables that StatsNZ provides. Trends across time would have been most interesting, but this question is new, so we can’t get them. As a surrogate for time trends, I first looked at age groups, with these results [as usual, click to embiggen]


There’s remarkably little variation by age: just a slight downturn for LGBT acceptance in the oldest group. I had expected an initial increase then a decrease: a combination of a real age effect due to teenagers growing up, then a cohort effect where people born a long time ago have old-fashioned views. I’d also expected more difference between the four questions over age group.

After that, I wasn’t sure what to expect looking at the data by region. Again, there’s relatively little variation.


For gender and education at least the expected relationships held: women and men were fairly similar except that men were less comfortable with LGBT neighbours, and comfort went up with education.


Dividing people up by ethnicity and migrant status was a mixture of expected and surprising. It’s not a surprise that migrants are happier with migrants as neighbours, or, since they are more likely to be members of religious minorities, that they are more comfortable with them. I was expecting migrants and people of Pacific or Asian ethnicity to be less comfortable with LGBT neighbours, and they were. I wasn’t expecting Pacific people to be the least comfortable with neighbours from an ethnic or racial minority.


As always with this sort of data it’s important to remember these responses aren’t really level of comfort with different types of neighbours. They aren’t even really what people think their level of comfort would be with different types of neighbours, just whether they say they would be comfortable. The similarity across the four questions makes me suspect there’s a lot of social conformity bias creeping in.

May 25, 2015

Cancer vaccine?

The segment was about research from the Malaghan Institute, who are working on ways to encourage a patient’s immune system to attack tumours. They say, in a press release

While the research will focus specifically on targeting melanoma, it is anticipated that the methodology being developed could be applied to other cancers in the future.

The therapeutic vaccine approach differs from the preventative vaccines used to protect against diseases such as measles or the flu because the cancer vaccine is designed to be given to an individual after they have already shown signs of disease.

“It is known that white blood cells called T cells can kill tumour cells,” says Dr Hermans. “The cancer vaccines, which are custom-made for each cancer patient, are designed to stimulate the activity of these cancer-fighting immune cells.”

As the press release makes clear, the term ‘vaccine’ is technically correct, but liable to mislead: these are customised immune-system treatments specific to one tumour in one individual. It’s nothing like the measles vaccine that you get as an infant for lifetime protection.

Like other research groups, the Malaghan Institute are starting with melanoma. There are at least two reasons melanoma is a good place to start. The simple reason: until very recently, metastatic melanoma was completely untreatable, so anything would be an improvement. There’s also a complex reason: melanoma occasionally shrinks or vanishes of its own accord, apparently more often than other tumours do. The spontaneous regressions are presumably thanks to the immune system waking up and realising the tumour is a problem, so melanoma is a good starting point if you want to find out how this happens and encourage it to happen more often.

The basic problem is that the immune system tends to see cancer cells as part of the patient, since, fundamentally, they are. The Malaghan Institute has an innovative addition to the treatment, a chemical that scares specific parts of the immune system into action. They expect that combining this with the existing tumour vaccine approaches will give a more reliable result.

Malaghan got $4.5 million from the Health Research Council to work on this, which is pretty impressive given the HRC budget and competition, but the melanoma vaccine is still in the initial stages of testing. The majority of promising treatments going into Phase I clinical trials don’t end up being useful. Even if this one does, that’s no guarantee it will work for other types of cancer, and while it’s a vaccine in the sense that it works by stimulating an immune reaction, it’s nothing like the vaccines we give to kids.

It’s not a good time to criticise Campbell Live, but although the Malaghan’s research is truly impressive, it’s not a general-purpose cure for cancer anytime this decade. And there’s a basic principle that you shouldn’t say “cure” in the headline unless there’s a cure.

Genetic determinism: infidelity edition

New York Times columnist Richard Friedman is writing about hormones, genetics, and infidelity.  This paragraph is about recently-published research by Brendan Zietsch and colleagues (the NYT tries to link, but the URL is wrong)

His study, published last year in Evolution and Human Behavior, found a significant association between five different variants of the vasopressin gene and infidelity in women only and no relationship between the oxytocin genes and sexual behavior for either sex. That was impressive: Forty percent of the variation in promiscuous behavior in women could be attributed to genes.

If you didn’t read carefully you might think this was a claim that the  vasopressin gene association explained the “Forty percent” and that the percentage was lower in men. In fact, the vasopressin gene associations are rather weaker than that, and the variation attributed by the researchers to genes is 62% in men.

But it gets worse. The correlation with genetics was only seen in identical twins. That is, pairs of identical twins had fairly similar cheating behaviour, but there was no similarity at all between pairs of non-identical twins (of any gender combination) or between non-twin siblings. If that’s not due to chance (which it could be), it’s very surprising. It doesn’t rule out a genetic explanation, but it means the genetics would have to be weird. You’d need either a variant that had opposite effects with one versus two copies, or a lot of variants that only had effects with two copies and no effect with one, or an effect that switched on only when you had variant copies of multiple genes, or an effect driven by new mutations not inherited from parents. The results for the vasopressin gene don’t have this kind of weird.
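To see why zero similarity between non-identical twins is surprising, here is a minimal simulation sketch. It is not the researchers' analysis: it assumes a deliberately simple additive model (many variants, each adding a little to the trait) with heritability set to the 40% quoted above, and the function names, number of loci, and sample size are all made up for illustration. Under that assumed model, non-identical twins should be roughly half as similar as identical twins, not completely dissimilar.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def make_parent(rng, n_loci):
    """A parent carries two alleles (0 or 1) at each locus."""
    return [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_loci)]


def child_genetic_value(rng, mother, father):
    """Additive model: the child gets one random allele from each parent per locus."""
    return sum(rng.choice(m) + rng.choice(f) for m, f in zip(mother, father))


def simulate_twin_correlations(n_pairs=3000, n_loci=10, heritability=0.4, seed=1):
    """Return (identical-twin correlation, non-identical-twin correlation)."""
    rng = random.Random(seed)
    genetic_var = n_loci * 0.5  # two Bernoulli(0.5) alleles per locus, variance 0.25 each
    noise_sd = (genetic_var * (1 - heritability) / heritability) ** 0.5
    mz_a, mz_b, dz_a, dz_b = [], [], [], []
    for _ in range(n_pairs):
        # Identical twins: one shared genotype, separate non-genetic noise.
        mother, father = make_parent(rng, n_loci), make_parent(rng, n_loci)
        shared = child_genetic_value(rng, mother, father)
        mz_a.append(shared + rng.gauss(0, noise_sd))
        mz_b.append(shared + rng.gauss(0, noise_sd))
        # Non-identical twins: same parents, independently inherited alleles.
        mother, father = make_parent(rng, n_loci), make_parent(rng, n_loci)
        dz_a.append(child_genetic_value(rng, mother, father) + rng.gauss(0, noise_sd))
        dz_b.append(child_genetic_value(rng, mother, father) + rng.gauss(0, noise_sd))
    return correlation(mz_a, mz_b), correlation(dz_a, dz_b)


r_mz, r_dz = simulate_twin_correlations()
print(f"identical twins: {r_mz:.2f}, non-identical twins: {r_dz:.2f}")
```

With these assumed settings the identical-twin correlation should come out near 0.4 and the non-identical one near 0.2. Getting a non-identical correlation of zero needs one of the stranger mechanisms listed above, or chance.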

The story is all “yes, it’s surprising that you’d get this sort of effect in a complex social behaviour, but genetics! And voles!”. I’ll give him the voles, but if anything, the strong correlation between identical twins (only) argues against vasopressin gene variants being a major driver in humans, and the research paper is much more cautious on this point.



May 24, 2015


  • Flickr has an automatic photo-tagging algorithm. Users don’t like it, because tagging is their business. And also because it “labelled images of black people with tags such as ‘ape’ and ‘animal’ as well as tagging pictures of concentration camps with ‘sport’ or ‘jungle gym’.”
  • A basic medical-reporting rule is that you don’t say “cure” in the headline if it isn’t a cure. The recent study of mutations in prostate cancer, reported in the Herald (from the Telegraph), is headlined “Breakthrough offers hope for prostate cancer cure”. About two-thirds of the tumours “had mutations in a molecule that interacts with the male hormone androgen which can already be targeted by current drugs.” That’s true, but it’s not only “can already be targeted”, but also “is already targeted as standard practice”. Many of the other commonly-found mutations are in genes like BRCA1 and BRCA2, where we don’t have anything like curative treatment. In the medium term, yes, this could be very useful, but the headline is over the top.

May 23, 2015

Data-driven journalism at Canon Media Awards

I had the chance to attend the Canon Media Awards Night, as a guest of the Science Media Centre (who are one of the sponsors).

It was a good year for data journalism.  Harkanwal Singh and his team won “Best use of interactive graphics” and “Best multimedia storytelling” for projects based on effective communication of publicly-available data.

Perhaps more importantly for the future, the citation for the Herald’s “Best digital cross-platform news coverage”  explicitly called out the integration of data:

“The combination of exclusive breaking stories, data journalism, use of the digital media platforms and social coverage, meant the user’s experience was both exciting and broad.”

with similar comments in the citation for best website.

Bloggers can do data analysis and visualisation. What the professional media can do that we usually can’t is combine this with traditional reporting — stories of individual experience, or detailed investigation of who is hiding what and why.

For the consumer of traditional journalism, data literacy gives context — is this the tip of the iceberg or just the tip of the icecube? Interactive, visual data publishing adds the opportunity for readers to explore further and have deeper engagement with the story.

May 22, 2015

Budget viz

Aaron Schiff has collected visualisations of the overall NZ 2015 budget

A useful one that no-one’s done yet would be something showing how the $25 benefit increase works out with other benefits being considered as income — either in terms of the distribution of net benefit increases or in terms of effective marginal tax rate.
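As a rough sketch of the calculation such a graphic would need: if other income-tested assistance treats the increase as income, part of the $25 is clawed back. The abatement rates below are hypothetical placeholders, not actual NZ policy settings, and adding them together is a simplifying assumption.

```python
def net_increase(gross_increase, abatement_rates):
    """Gross increase minus what is lost from other assistance.

    abatement_rates: fraction of each extra dollar of income that each
    other payment claws back (hypothetical values, assumed to simply add).
    """
    clawed_back = sum(rate * gross_increase for rate in abatement_rates)
    return gross_increase - clawed_back


def effective_marginal_rate(gross_increase, abatement_rates):
    """Share of the gross increase that the household does not keep."""
    return 1 - net_increase(gross_increase, abatement_rates) / gross_increase


# Hypothetical households, described only by the abatement rates they face.
households = {
    "no other assistance": [],
    "one abated payment": [0.25],
    "two abated payments": [0.25, 0.25],
}
for label, rates in households.items():
    net = net_increase(25, rates)
    rate = effective_marginal_rate(25, rates)
    print(f"{label}: keeps ${net:.2f} of $25 (effective rate {rate:.0%})")
```

The distribution of net increases would come from running this over real household circumstances and real abatement rules, which the sketch does not attempt.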

May 21, 2015

Fake data in important political-science experiment

Last year, a research paper came out in Science demonstrating an astonishingly successful strategy for gaining support for marriage equality: a short, face-to-face personal conversation with a gay person affected by the issue. As the abstract of the paper said

Can a single conversation change minds on divisive social issues, such as same-sex marriage? A randomized placebo-controlled trial assessed whether gay (n = 22) or straight (n = 19) messengers were effective at encouraging voters (n = 972) to support same-sex marriage and whether attitude change persisted and spread to others in voters’ social networks. The results, measured by an unrelated panel survey, show that both gay and straight canvassers produced large effects initially, but only gay canvassers’ effects persisted in 3-week, 6-week, and 9-month follow-ups. We also find strong evidence of within-household transmission of opinion change, but only in the wake of conversations with gay canvassers. Contact with gay canvassers further caused substantial change in the ratings of gay men and lesbians more generally. These large, persistent, and contagious effects were confirmed by a follow-up experiment. Contact with minorities coupled with discussion of issues pertinent to them is capable of producing a cascade of opinion change.

Today, the research paper is going away again. It looks as though the study wasn’t actually done. The conversations were done: the radio program “This American Life” gave a moving report on them. The survey of the effect, apparently not so much. The firm who were supposed to have done the survey deny it, the organisations supposed to have funded it deny it, the raw data were ‘accidentally deleted’.

This was all brought to light by a group of graduate students who wanted to do a similar experiment themselves. When they looked at the reported data, it looked strange in a lot of ways (PDF). It was of better quality than you’d expect: good response rates, very similar measurements across two cities,  extremely good before-after consistency in the control group. Further investigation showed before-after changes fitting astonishingly well to a Normal distribution, even for an attitude measurement that started off with a huge spike at exactly 50 out of 100. They contacted the senior author on the paper, an eminent and respectable political scientist. He agreed it looked strange, and on further investigation asked for the paper to be retracted. The other author, Michael LaCour, is still denying any fraud and says he plans to present a comprehensive response.

Fake data that matters outside the world of scholarship is more familiar in medicine. A faked clinical trial by Werner Bezwoda led many women to be subjected to ineffective, extremely-high-dose chemotherapy. Scott Reuben invented all the best supporting data for a new approach to pain management; a review paper in the aftermath was titled “Perioperative analgesia: what do we still know?”  Michael LaCour’s contribution, as Kieran Healy describes, is that his approach to reducing prejudice has been used in the Ireland marriage equality campaign. Their referendum is on Friday.


  • Sometimes people with an axe to grind are right. In this case, it’s the people who cast aspersions on the leading data-based research opposing same-sex parenting. The closer they look at the data, the less convincing it is. (NY Magazine, new research paper) “The reanalysis illustrates the importance of methodological decisions in research”
  • A study of spreadsheets in their natural habitat: blog post, paper by Felienne Hermans
  • Facetted barcharts and fluctuation diagrams, from Di Cook. The data describes the responses of couples on questions about their sex life.
  • Looking at scientists giving advice on politically controversial topics: a case study of badger culling in the UK by Helen Briggs. “The badgers moved the goalposts”
  • From the New Zealand conference ‘Going Public,’ on the same topic, a post by Dr SM Morgan, who works on health literacy. “Find something complementary to say about a scientific colleague’s scicomm efforts and imagine saying it out loud to their face.”

May 20, 2015

Actually it’s about neuroscience in videogame journalism

Q: Why does Stuff think playing ‘Call of Duty’ increases the risk of Alzheimer’s disease? I didn’t think old people played violent video games much.

A: I don’t think Stuff does think anything in particular about it. They just reprinted that from the Telegraph and trimmed out the casual sexism.

Q: Ok, why does the Telegraph think playing ‘Call of Duty’ increases the risk of Alzheimer’s disease? I didn’t think old people played violent video games much.

A: It’s not so much about the games they play now as the ones they played 60 years earlier

Q: What video games did they play 60 years ago?

A: Not current people with Alzheimer’s; gamers now who might get Alzheimer’s in a few decades.

Q: Ok, so why will that happen?

A: Because video game players use response learning strategies for navigation in video mazes

Q: Why is that bad?

A: Because other research found people who used those strategies had more activity in their caudate nucleus.

Q: Is this going to start making sense soon?

A: Yes. Sorry. The research found that, when navigating a virtual-reality maze, habitual game players were more likely than normal people to use strategies that had previously been correlated with less activity in a part of the brain involved in memory and spatial awareness. They apparently used different strategies involving other parts of the brain.

Q: And why is this a problem?

A: Because that part of the brain, the hippocampus, is less active in people with Alzheimer’s, as well as some other neurological and psychological disorders

Q: While they’re playing video games?

A: No, all the time.

Q: Couldn’t it just be that the video gamers have developed a more efficient strategy and that their hippocampuses are perfectly fine? Or hippocampi, whatever.

A: Yes, that could also be the case.

Q: I mean, if you saw people twitching their thumbs rapidly playing a video game it would be fine, but if they were just doing that while sitting around at meetings you’d worry a bit.

A: Indeed.

Q: Did the research look at memory or cognition at all?

A: No.

Q: Do they even know that these brain differences happened after playing video games? Could it be that people who don’t use that part of the brain for video navigation are just better at games?

A: It could be, yes.

Q: The story quotes the percentages using the parts of their brain to four significant digits. Does that mean there were tens of thousands of people in the research?

A: No.

Q: How many?

A: 59: about 30 in each group

Q: If this was true, could it explain why dementia is increasingly common?

A: No.

Q: Why not?

A: Partly because it’s too soon, and partly because dementia isn’t increasingly common at a given age, at least in the US and Europe. If anything, it’s less common. There are more cases now because there are more old people.

Q: It sounds like more research might be needed before writing international headlines about the risk of a terrifying disease.

A: You think?
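For anyone who wants the arithmetic behind the four-significant-digits question, here is a small sketch using the usual standard-error formula for a proportion. The group size of 30 comes from the answer above. The 50% proportion and the target precision of 0.01 percentage points are illustrative assumptions, not numbers from the study.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion p estimated from n people."""
    return math.sqrt(p * (1 - p) / n)


def sample_size_for_standard_error(p, target_se):
    """Approximate sample size needed for a standard error of target_se."""
    return math.ceil(p * (1 - p) / target_se ** 2)


group_size = 30  # about 30 people in each group
se = proportion_standard_error(0.5, group_size)
print(f"standard error with n={group_size}: {100 * se:.1f} percentage points")

# Assumed target: a standard error of 0.01 percentage points,
# which is roughly what the fourth digit of a percentage implies.
needed = sample_size_for_standard_error(0.5, 0.0001)
print(f"people needed per group for that precision: about {needed:,}")
```

With about 30 people per group the uncertainty is several percentage points, so anything after the first digit or so of the percentage is decoration.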

Weather uncertainty

From the MetService warnings page


The ‘confidence’ levels are given numerically on the webpage as 1 in 5 for ‘Low’, 2 in 5 for ‘Moderate’ and 3 in 5 for ‘High’. I don’t know how well calibrated these are, but it’s a sensible way of indicating uncertainty.  I think the hand-drawn look of the map also helps emphasise the imprecision of forecasts.
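Checking calibration would be straightforward with a record of past warnings: group them by stated confidence and compare with how often the event actually happened. A minimal sketch follows. The history is invented purely to show the shape of the calculation (it is not MetService data), and `calibration_table` is a made-up helper name.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Map each stated probability to (observed frequency, number of forecasts).

    forecasts: iterable of (stated_probability, event_happened) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # stated probability -> [events, forecasts]
    for stated, happened in forecasts:
        counts[stated][0] += int(happened)
        counts[stated][1] += 1
    return {
        stated: (events / total, total)
        for stated, (events, total) in sorted(counts.items())
    }


# Stated probabilities as given on the warnings page.
labels = {"Low": 0.2, "Moderate": 0.4, "High": 0.6}

# Hypothetical history of (confidence label, did the event happen).
history = (
    [("Low", True)] * 2 + [("Low", False)] * 8
    + [("Moderate", True)] * 2 + [("Moderate", False)] * 3
    + [("High", True)] * 3 + [("High", False)] * 2
)

table = calibration_table((labels[label], happened) for label, happened in history)
for stated, (observed, total) in table.items():
    print(f"stated {stated:.0%}: observed {observed:.0%} over {total} forecasts")
```

Well-calibrated forecasts have observed frequencies close to the stated ones. With real data you would also want far more than a handful of forecasts per level before drawing conclusions.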

(via Cate Macinnis-Ng on Twitter)