Posts written by Thomas Lumley (1874)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

October 17, 2016


  • Beautiful weather maps from Ventusky, via Jenny Bryan
  • From BusinessInsider: 90% of executive board members think the ideal proportion of women on boards is higher than the current 20%, but the majority think it should still be 40% or less.
  • The Ministry for Social Development is collecting more data on people who use government-supported community services. On one hand, they’re less likely to misuse it than a lot of internet companies; on the other hand, it might well deter people from seeking help. And while the Ministry is getting written consent, the people obtaining it won’t get paid by the Ministry if consent isn’t given.
  • If you only read one summary of the state of the US elections, the 538 update is a relatively painless and informative one.
  • People might be worrying too much about hackers (techy)

Moreover, we find that cyber incidents cost firms only 0.4% of their annual revenues, much lower than retail shrinkage (1.3%), online fraud (0.9%), and overall rates of corruption, financial misstatements, and billing fraud (5%).


“Kind of” being an important qualifier here.

Vote takahē for Bird of the Year

It’s time again for the only bogus poll that StatsChat endorses: the New Zealand Bird of the Year.

Why is Bird of the Year ok?

  • No-one pretends the result means anything real about popularity
  • The point of the poll is just publicity for the issue of bird conservation
  • Even so, it’s more work to cheat than for most bogus polls


Why takahē?

  • Endangered
  • Beautiful (if dumb)
  • Very endangered
  • Unusual even by NZ bird standards: most of their relatives (the rail family) are shy little waterbirds.


(A sora, a more-typical takahē relative, by/with ecologist Auriel ‘@RallidaeRule’ Fournier)

October 13, 2016

Weighting surveys

From the New York Times: “How One 19-Year-Old Illinois Man Is Distorting National Polling Averages”

There is a 19-year-old black man in Illinois who has no idea of the role he is playing in this election.

He is sure he is going to vote for Donald J. Trump.

I think the story exaggerates the impact of this guy’s opinions on polling averages, but it’s a great illustration of one of the subtleties of polling.

Even in New Zealand, you often see people claiming, for example, that opinion polls will underestimate the Green Party vote because Green voters are younger and more urban, and so are less likely to have landline phones. As we see from the actual elections, that isn’t true. Pollsters know about these simple forms of bias, and use weighting to fix them — if they poll half as many young voters as they should, each of their votes counts twice. Weighting isn’t as good as actually having a representative sample, but it’s ok — and unlike actually having a representative sample, it’s achievable.
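To make the mechanics concrete, here is a minimal post-stratification sketch in Python. Every number in it is invented for illustration; real polls weight on several variables at once, not just my toy age groups.

```python
# Toy post-stratification: reweight age groups in a poll to match the
# population. All numbers here are invented for illustration.

population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample_count = {"18-34": 150, "35-54": 350, "55+": 500}  # 1000 responses
support = {"18-34": 0.45, "35-54": 0.30, "55+": 0.25}    # observed support by group

n = sum(sample_count.values())

# Each group's weight is (population share) / (sample share): a group sampled
# at half its true rate counts double, so the young group here gets weight 2.
weights = {g: population_share[g] / (sample_count[g] / n) for g in sample_count}

# The unweighted estimate reflects whoever happened to respond...
unweighted = sum(sample_count[g] * support[g] for g in sample_count) / n

# ...while the weighted estimate matches the population's age structure.
weighted = sum(population_share[g] * support[g] for g in population_share)

print(weights)
print(round(unweighted, 2))  # about 0.30: young, more supportive voters underrepresented
print(round(weighted, 2))    # about 0.33 after reweighting
```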

One of the tricky parts of weighting is choosing which groups to weight by. If you make the groups too broadly defined, you don’t remove enough bias; if you make them too narrowly defined, you end up with a few people getting really extreme weights, making the sampling error much larger than it should be. That’s what happened here: the survey had one person in one of its groups, and that person turned out to be unusual. But it gets worse.
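One standard way to quantify the cost of extreme weights is Kish’s approximation to the effective sample size; here is a sketch, with hypothetical weights rather than anything from the actual survey.

```python
# Kish's approximation: n_eff = (sum of weights)^2 / (sum of squared weights).
# The weights below are hypothetical.

def effective_sample_size(weights):
    return sum(weights) ** 2 / sum(w * w for w in weights)

# 300 respondents with nearly equal weights: almost no precision lost.
even = [1.0] * 299 + [1.5]
print(round(effective_sample_size(even)))     # about 300

# The same 300 respondents, but one person in a tiny weighting cell gets
# weight 30: the survey now behaves like one of about 90 people.
extreme = [1.0] * 299 + [30.0]
print(round(effective_sample_size(extreme)))  # about 90
```

The exact formula matters less than the moral: a single huge weight can make a 300-person survey behave like a much smaller one.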

The impact of the weighting was amplified because this is a panel survey, polling the same people repeatedly. Panel surveys are useful because they allow much more accurate estimation of changes in opinions, but an unlucky sample will persist over many surveys.

Worse still, one of the weighting factors used was how people say they voted in 2012. That sounds sensible, but it breaks one of the key assumptions about weighting variables: you need to know the population totals.  We know the totals for how the population really voted in 2012, but reported vote isn’t the same thing at all — people are surprisingly unreliable at reporting how they voted in the past.

The actual impact on polling aggregators such as 538 is probably pretty small, since they model and try to remove ‘house effects’ (differences between surveys). However, the poll does give aid and comfort to people who don’t want to believe the consensus results, and that is not helpful.

October 11, 2016


  • A curriculum to help kids think critically about health claims has been developed — and is being evaluated in a randomised trial in Uganda (from Vox)
  • Someone else (the website Grub Street) has fallen for the cheese addiction hoax. I wrote here about how the story makes no sense.  There’s a post by SciCurious that includes an interview with one of the people behind the actual research, talking about how the story just isn’t supported by her work. We still don’t seem to know who is pushing the hoax version.
  • I was on RadioNZ’s Our Changing World, talking to Alison Ballance about means and medians
  • Using mathematics (or statistics) to help with art repair:  Ingrid Daubechies talks about her work.
  • From MBIE, an interactive map of NZ tourist numbers

This has been an urban legend in the UK — it’s true in Melbourne, though mostly because the Mt Waverley reservoir is a small storage buffer rather than main storage

October 6, 2016

Coffee news

Q: Did you see that two cups of coffee a day will prevent dementia?

A: In Stuff? That’s not what it says.

Q: “Two cups of coffee a day can keep dementia at bay – research”

A: Read a bit further.

Q: Ok, so it’s just in women over 65, and two to three cups, and it’s a 36% reduction,  not as good as the headline says, but still pretty good, surely?

A: There’s a lot of uncertainty in that number.

Q: So what’s the margin of error or whatever the medical folks call it?

A: According to the research paper a 95% confidence interval for the reduction goes from 1% to 44%. And it’s a reduction in rate, not in risk — it could easily be postponing rather than preventing dementia, even if it works.

Q: Was there a link to the paper?

A: No, but there was a link to the press release, and it linked to the paper.

Q: That interval. Why isn’t 36% in the middle of the interval?

A: I don’t know. The results in the abstract and tables of the paper give a hazard ratio of 0.74. I can think of two possibilities. One is that the 36% isn’t based on the primary findings in the abstract but on a less well-described secondary analysis. The other is that someone subtracted 74% from 100% and got it wrong: the right answer would be a 26% reduction, not 36%.
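For what it’s worth, the conversion from a hazard ratio to a percent reduction is one line of arithmetic. A sketch, with the hazard-ratio interval endpoints back-calculated from the 1% to 44% reduction quoted above:

```python
# A hazard ratio (HR) converts to a percent reduction in rate as 100 * (1 - HR).
# The HR endpoints below are back-calculated from the 1% to 44% interval quoted
# above, since the conversation reports the reduction rather than the HR.

def percent_reduction(hazard_ratio):
    return 100 * (1 - hazard_ratio)

print(round(percent_reduction(0.74)))  # 26: the point estimate, not 36
print(round(percent_reduction(0.99)))  # 1: one end of the 95% interval
print(round(percent_reduction(0.56)))  # 44: the other end
```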

Q: Why is it just women over 65?

A: Because that’s who was in the study.

Q: So the coffee-drinking didn’t necessarily start at 65?

A: No, and it wasn’t necessarily coffee. It could have been tea or soda.

Q: Could they look at whether the coffee drinkers were different at the start of the study?

A: Yes — and they were.  The difference in their cognitive test scores stayed pretty much constant during the study, and the correlation with caffeine mostly goes away if you compare people starting out with the same test scores.

Q: So it might be that caffeine matters at an earlier age, not over 65?

A: And it might not matter — perhaps the people who drink a lot of caffeine were at lower risk for some other reason.

Q: Could it still be true?

A: It could. It is in some lab-animal models of Alzheimer’s, but no-one really knows how relevant they are to human dementia.

Q: Rats.

A: Yes, and mice.

Q: No, that was a colloquial exclamation expressing frustration, disappointment, or annoyance.

October 4, 2016

Depression and the pill

There’s a recent paper from Denmark finding that women, particularly young women, who used hormonal contraceptives were more likely also to be diagnosed with  depression.  The Guardian has a sensible story reporting on the paper (though given the topic it’s a pity the external experts they talked to were both men). There’s also an opinion piece, which conveys the importance of the issue but is clearly written by someone whose opinions were decided before the research came out. I was asked on Twitter what I thought.

One of the more difficult cases for science communication is where the evidence is neither negligible nor overwhelming, and that’s the situation here.  There’s nothing intrinsically unlikely about an effect on depression, and there are some ways that this study is very good, but there are also some limitations to the data that make the evidence weaker.

First, the good points. The study involved the entire Danish population over nearly 20 years, meaning that it was large enough to be fairly reliable on whether correlations are present or not, and also that it was comprehensive — it didn’t miss people out.  The data on who used hormonal contraceptives comes from the national health system and so should be accurate. The two definitions of depression — ‘prescribed anti-depressant drugs’ and ‘psychiatrist diagnosis of depression’ — will be measured reliably, and the decisions will have been taken by people who don’t have any particular view on the study question.  There’s information on timing, so we know the contraceptives were used before the depression. The associations are strong enough to care about, but not so strong as to be implausible. The analysis is well done given the data.

However, there are at least two alternative explanations for the correlation that aren’t ruled out by these data. The first is that the depression definitions require seeing a doctor and asking for (or at least accepting) treatment, and women who take hormonal contraceptives are probably more likely to see a doctor regularly.  The second explanation, which the researchers do consider, is that break-ups of relationships are a cause of depression, especially in younger people, and being in these relationships might be related to using hormonal contraceptives.  The researchers don’t believe this explanation, and they may be right, but their data don’t rule it out.

It’s not that either of these explanations is necessarily more likely than a direct effect of hormones, but if there weren’t alternative explanations the evidence would be stronger.  For example, if the researchers had been able to compare women using hormonal contraceptives just to those using non-hormonal contraceptives (eg copper IUDs and condoms), and had still seen the same correlation, the second explanation would be much less plausible and the evidence for a direct effect would be more convincing.

Also, if there were a straightforward hormonal explanation I would have expected different types of contraceptive to have stronger or weaker associations according to the dose of, say, progestins. In fact what they saw is that less commonly used contraceptives had stronger associations: weakest for the combined pills, stronger for progestin-only ‘mini-pills’ and stronger still for patch and implant methods. Again, this certainly doesn’t rule out a direct effect, but it weakens the evidence.

If a similar study were done in another country with different patterns of contraceptive use and found similar results, the evidence would become stronger. A study with fewer women but more detailed information on mental and emotional health — such as one of the birth-cohort studies — might be able to say more about what leads up to episodes of depression in young women and might be able to say something about who is at most risk. There’s still going to be uncertainty.

So. It’s hard to say for sure. There is definitely some evidence that hormonal contraceptives increase the risk of depression. If the effect is real, it’s useful to know that it seems to be largely in women under 20, largely in the first year of use, and might be worse for the ‘mini-pill’ than the traditional pill.  There’s a lot already known — good and bad — about hormonal contraceptives, but this research paper does add something.

October 2, 2016


  • There has been some … free and frank exchange of views… this week on the question of criticising published research. The phrase “methodological terrorism” was used. Rather than linking to the combatants, I’ll give you Hilda Bastian and Jeff Leek (who have themselves had strongly-worded exchanges here and elsewhere).
  • “Before analytics, businesses often had policies that every customer should be treated like they’re the best customer – because absent the data, the assumption was that every customer had that potential. But in the data age, there is no more benefit of the doubt.” (Cathy Carleton). Some people (mostly economists) will probably feel that this is all good. That’s a defensible position, but poor service for the poor wasn’t a goal of the analytics system.
  • There are people here and in the US claiming that self-selected (‘bogus’) internet polls with no reweighting or modelling give useful information. Those people are wrong. Do not be those people.

September 25, 2016


  • A post from Minding Data looking at the proportion of syndicated stories in the Herald.  I’m not sure about the definition — some stories are edited here, and it’s not clear what it takes to not have an attribution to another paper.
  • On measuring the right numbers, from Matt Levine at Bloomberg View: “The infamous number is that 5,300 Wells Fargo employees were fired for setting up fake customer accounts to meet sales quotas, but it is important — and difficult — to try to put that number in context. For instance: How many employees were fired for not meeting sales quotas because they didn’t set up fake accounts?”
  • Data Visualisation: how maps have shown elevation, from National Geographic — including why maps of European mountains are lit from the northwest, rather than from somewhere the sun might be. (via Evelyn Lamb)
  • I was Unimpressed when the authors of an unconvincing paper on GMO dangers had a ‘close-hold embargo’ — allowing journalists an advance look only if they promised not to get any expert input to their stories. It’s not any better when the FDA does it.
September 18, 2016

Yo mamma so smart

Q: Did you see intelligence is inherited just from mothers?

A: Yeah, nah.

Q: No, seriously. It’s in Stuff. “Recent scientific research suggests that rather than intelligence being genetically inherited from both their parents, it comes from their mother.”

A: I don’t think so.

Q: You’re objecting to their definition of intelligence, aren’t you?

A: Not this time. For today, I’m happy to stipulate to whatever their definition is.

Q: But they have Science! “Intelligence genes originate from the X chromosome” and “Some of these affected genes work only if they come from the mother. If that same gene is inherited from the father, it is deactivated.”

A: That sounds like two different explanations grafted together.

Q: Huh?

A: Some genes are imprinted so the paternal and maternal copies work differently, but that’s got nothing to do with the X chromosome.

Q: Why not?

A: Because any given cell has only one functioning X chromosome: for men it comes from their mother; for women it’s a random choice between the ones from each parent.

Q: Ok. But are all the intelligence genes on the X chromosome?

A: No. In fact, modern studies using hundreds of thousands of genetic variants suggest that genes contributing to intelligence are everywhere on the genome.

Q: But what about the ‘recent research’?

A: What recent research? I don’t see any links.

Q: Maybe they’re in the blog post that the story mentions but doesn’t link to. Can you find it?

A: Yes.

Q: And the references?

A: Mostly in mice.

Q: But there’s one about a study in Glasgow, Scotland. In nearly 13,000 people.

A: There is, though it’s actually an analysis of the US National Longitudinal Study of Youth.  Which, strangely enough, did not recruit from Glasgow, Scotland. And less than half of the 12,686 participants ended up in the analysis.

Q: Whatever. It’s still recent research?

A: Ish. 2006.

Q: And it found mother’s intelligence was the most important predictor of child’s intelligence, though?

A: Yes, of the ones they looked at.

Q: So, more important than father’s intelligence?

A: That wasn’t one of the ones they looked at.

Q: “Wasn’t one of the ones they looked at”

A: Nope.

Q: Ok. So is there any reason for saying intelligence genes are on the X chromosome or is it all bollocks?

A: Both.

Q: ಠ_ಠ

A: Especially before modern genomics, it was much easier to find out about the effects of genes on the X chromosome, since breaking them will often cause fairly dramatic disorders in male children.

Q: So it’s not that more intelligence-related genes are on the X chromosome, just that we know more about them?

A: That could easily be the case. And just because a gene affects intelligence when it’s broken doesn’t necessarily mean small variations in it affect normal intelligence.

Q: But wouldn’t it be great if we could show those pretentious ‘genius’ sperm-donor organisations were all useless wankers?

A: On the other hand, we don’t need more reasons to blame mothers for their kids’ health and wellbeing.

September 17, 2016

Local polls

Since we have another episode of democracy coming on, there are starting to be more stories about polls for me to talk about.

First, the term “bogus”.  Two people, at least one of whom should have known better, have described poll results they don’t like as “bogus” recently. Andrew Little used the term about a One News/Colmar Brunton poll, and Nick Leggett said “If you want the definition of a bogus poll this is it” about results from Community Engagement Ltd.

As one of the primary NZ users of the term ‘bogus poll’ I want it to mean something. Bogus polls are polls that aren’t doing anything to get the right answer. For example, in the same Dominion Post story, Jo Coughlan mentioned

“…two independent Fairfax online Stuff polls of 16,000 and 3200 respondents showing me a clear winner on 35 per cent and 50 per cent respectively.”

Those are bogus polls.

So, what about the two Wellington polls cited as support for the candidates who sponsored them? Curia gives more detail than the Dominion Post.  The results differ by more than the internal margin of error, which will be partly because the target populations are different (‘likely voter’ vs ‘eligible’), and partly because the usual difficulties of sampling are made worse by trying to restrict to Wellington.
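As a reminder, the usual ‘maximum margin of error’ is the worst case for a 50% result in a simple random sample, roughly 1/√n. A sketch, with hypothetical sample sizes (not the actual sizes of either Wellington poll):

```python
# Maximum margin of error for a simple random sample: 1.96 * sqrt(0.5*0.5/n),
# which is roughly 1/sqrt(n). The sample sizes below are hypothetical.
from math import sqrt

def max_margin_of_error(n):
    return 1.96 * sqrt(0.25 / n)

for n in (400, 500, 1000):
    print(f"n = {n}: +/- {100 * max_margin_of_error(n):.1f} points")
# n = 400: +/- 4.9 points
# n = 500: +/- 4.4 points
# n = 1000: +/- 3.1 points
```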

It wouldn’t be unreasonable to downweight the poll from Community Engagement Ltd just because they seem to be a new company, but the polls agree the vote will go to preferences. That’s when things get tricky.

Local elections in NZ use Single Transferable Vote, so second and later preferences can matter a lot.  It’s hard to do good polling in STV elections even in places like Australia where there’s high turnout and almost everything really depends on the ‘two-party preferred’ vote — whether you rank Labor above or below the L/NP coalition.  It’s really hard when you have more than two plausible candidates, and a lot of ‘undecided’ voters, and a really low expected turnout.

With first-past-the-post voting the sort of posturing the candidates are doing would be important — you need to convince your potential supporters that they won’t be wasting their vote. With STV, votes for minor candidates aren’t wasted and you should typically just vote your actual preferences, and if you don’t understand how this works (or if you think you do and are wrong) you should go read Graeme Edgeler on how to vote STV.