Posts written by Thomas Lumley (2088)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

December 15, 2017

Big Fat Misinformation

Q: Did you see there’s a diet that makes you burn energy ten times faster?

A: That … doesn’t sound very likely.

Q: It’s in the Herald

A: But it’s also in the Daily Mail.

Q: You could look up the research paper

A: <sigh>


A: Ok. Here it is.

Q: That took a while.

A: The story didn’t give the names of any of the researchers.

Q: Did the diet make people burn energy ten times faster?

A: No

Q: Mice?

A: It was people, but they didn’t burn energy ten times faster

Q: Are you sure?

A: Here’s the graph from the research paper: RMR stands for ‘resting metabolic rate’ and the colors indicate the groups

Q: The red line is higher.  Is that the magic diet?

A: Yes.

Q: It’s not ten times higher

A: No

Q: Ten what, then?

A: The slope of the red line is ten times as steep as the slope of the other lines

Q: They all look kinda flat to me.

A: What’s ten times not a lot?

Q: Ok. Point.  The red line looks higher right from the start. The story says “They were randomly placed into three groups”

A: … “in the order they signed up for the study.”

Q: Well, you can’t randomly assign them before they sign up. Oh.  You mean they were just allocated to each group in turn.

A: Yes.

Q: Is that international best practice?

A: No.

Q: But does the diet work?

A: I don’t think the research adds much to what’s known about this question

Q: Which is?

A: Do you really think you’re going to get a simple and definitive solution to the low-carb diet controversy from a statistical blog?

Q: Ok, can I at least have some sort of sound bite?

A: Magic diet is not magic


Public comments, petitions, and other self-selected samples

In the US, the Federal Communications Commission was collecting public comments about ‘net neutrality’ — an issue that’s commercially and politically sensitive in a country where many people don’t have any real choice about their internet provider.

There were lots of comments: from experts, from concerned citizens, from people who’d watched a John Oliver show. And from bots attaching real names and addresses to automated comments. The Wall Street Journal contacted a random sample of nearly 3000 commenters and found the majority of those they could get in contact with had not submitted the comment attached to their details. The StartupPolicyLab attempted to contact 450,000 submitters, and got a response from just over 8000. Of the 7000 responses about pro-neutrality comments, nearly all agreed they had made the comment, but of the 1000 responses about anti-neutrality comments, about 88% said they had not made the comment.

It’s obviously a bad idea to treat the comments as a vote. Even if the comments were from real US people, with one comment each, you’d need to do some sort of modelling of the vast majority who didn’t comment.  But what are they good for?

One real benefit is for people to provide ideas you hadn’t thought of.  The public comment process on proposed New Zealand legislation certainly allows for people like Graeme Edgeler to point out bugs in the drafting, and for people whose viewpoints were not considered to speak out.  For this, it doesn’t matter what the numbers of comments are, for and against. In fact, it helps if people who don’t have something to say don’t say it.

With both petitions and public comments there’s also some quantitative value in showing that concern about some issue you weren’t worrying about isn’t negligibly small; that thousands (in NZ) or hundreds of thousands (in the US) care about it.

But if it’s already established that an issue is important and controversial, and you care about the actual balance of public opinion, you should be doing a proper opinion poll.


  • From the LA Times, week before last: The Los Angeles Police Department asked drivers to avoid navigation apps, which are steering users onto more open routes — in this case, streets in the neighborhoods that are on fire.  A related XKCD
  • From a Twitter thread about machine learning in the law (via Alex Hayes): For example, in crime prediction, X is not a sample from the population but instead from the subset of the population investigated by the police. Y is not the true outcome (e.g., did the person commit a crime), but the conclusion of the legal system. So a model predicting Y from X is predicting how the legal system will treat a person X that they have chosen to investigate. It is not predicting whether a person X’ drawn from the general population is guilty (Y’).  He’s not saying this is impossible to fix, but it won’t fix itself.
  • Wil Undy, who did the Herald’s poll aggregator in the NZ election, now has a blog.
  • A fascinating newly-discovered optical illusion
  • Interactive map of debt in the USA (via Alberto Cairo)
  • Thinking about ethics of machine-learning experiments such as determining sexual orientation from face images.
  • The Washington Post has exit polls for the Alabama senate election, including graphs.  I think it would be helpful to show the size of different groups, not just how they voted.

December 8, 2017

Attributing risk

Some time in the next week or so, we should be getting the ACC Christmas Sermon, where we get told about how many accidents happen on Christmas Day. From last year’s version in the Herald

Every year, more than 3400 claims are lodged with ACC for Christmas Day incidents, costing the country almost $3million.

As I always point out, that’s a lot less than the number lodged on a typical day that isn’t Christmas.  On the other hand, many of those 3400 are genuinely Christmas-caused injuries; accidents that would not have happened on some random day in summer.

You can look at Christmas-attributable risk by considering individual cases and counting the number that involve new toys, Christmas trees, batteries inserted inappropriately, misuse of wrapping paper, etc, etc. Or, you can compare Christmas to an otherwise similar day.

Rafa at Simply Statistics writes about a more serious example.

The official death toll from Hurricane María in Puerto Rico is 55. That’s 55 people whose death can be specifically and clearly attributed to the hurricane. However, the number of recorded deaths from all causes in September was 2838, which is 455 above the average for September in recent years. The next largest exceedance in the past seven years was just over 200 in November 2014.

Attributing deaths on a case-by-case basis to a disaster like María is hard; it would be hard to make those sorts of decisions even without the continuing post-hurricane disruption. Another example is deaths due to the 2003 power outage in New York, where there were 6 officially-attributed deaths but a spike of 90 in the total death statistics.

Sometimes we want to look at specifically attributable cases: when snow shuts down the roads, we probably want to count the number of snow-caused crashes without subtracting the number of snow-prevented ones. But for natural disasters it’s probably the total excess deaths we want.
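As a rough sketch of the "compare to an otherwise similar period" approach, here is the arithmetic behind the María figures quoted above. The baseline is not reported directly; it is implied by the two quoted numbers, so treat it as an assumption rather than an official statistic.

```python
# Figures quoted above for Puerto Rico, September 2017.
observed_deaths = 2838      # recorded deaths from all causes
excess_over_average = 455   # amount above the recent September average
official_toll = 55          # deaths individually attributed to the hurricane

# Assumption: the recent-years September average is implied by the two
# quoted numbers; it is not given directly in the post.
baseline = observed_deaths - excess_over_average

# How many excess deaths there are for each officially attributed death.
ratio = excess_over_average / official_toll

print(f"implied September baseline: {baseline}")
print(f"excess deaths per officially attributed death: {ratio:.1f}")
```

The point is only that the two ways of counting differ by something like a factor of eight here, not that either number is exact.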

December 7, 2017

If this goes on…


If you click through, things are less local and immediate: ATMs could be extinct in Australia within 30 years


A projection of data from the Reserve Bank of Australia by has found ATMs could be a distant memory in Australia by 2036.

2036 is in 19 years, and 19 is less than thirty, so I suppose that counts as within 30 years. So how did they do the projection? There’s not much detail in the story and I couldn’t find any on

The story says

According to, the number of ATM withdrawals per month has fallen from a high of 73 million in 2010 to just 47 million this year. If the trend continues at the same rate, ATM use will reach zero in three decades.

Now, I can fit a straight line to data. They teach you this in statistics. They often also teach you not to do it with just two points, but whatever.

Ok, maybe had more data or more detailed data or something, but the information in the story is all we have, and it doesn’t really support either “2036” or “30 years”
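For anyone who wants to check, here is a minimal sketch of the two-point straight line, using only the numbers in the story. It assumes "this year" means 2017 and that the 2010 high can be treated as a single point; the function name is just for illustration.

```python
# Two points quoted in the story (millions of ATM withdrawals per month).
# Assumption: "this year" means 2017, the year of the story.
year_high, withdrawals_high = 2010, 73.0
year_now, withdrawals_now = 2017, 47.0

def zero_crossing_year(x0, y0, x1, y1):
    """Year at which the straight line through (x0, y0) and (x1, y1) hits zero."""
    slope = (y1 - y0) / (x1 - x0)
    return x1 - y1 / slope

slope = (withdrawals_now - withdrawals_high) / (year_now - year_high)
zero_year = zero_crossing_year(year_high, withdrawals_high,
                               year_now, withdrawals_now)

print(f"slope: {slope:.2f} million withdrawals/month per year")
print(f"line reaches zero around {zero_year:.1f}")
```

On these two points the line reaches zero a bit before 2030, about 13 years out, which matches neither figure in the story.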

I don’t know how long ATMs will last. And I don’t think does either. But they do know how to get a free mention in the Herald.

December 4, 2017

Compared to what?

From Stuff

Your summer pavlova costs more than 40 per cent more to make this year than it did 10 years ago – and commentators think that trend will continue.

That’s true, but prices now and prices ten years ago are in different currencies, and so shouldn’t just be compared like numbers.

Using the RBNZ inflation calculator, about half the apparent price increase is just currency conversion; a 2007 dollar is worth about 1.2 2017 dollars.

On top of that, incomes have changed over the past ten years. The median annual household income is up about 36%, so if pavlova is less affordable than in 2007 it’s mostly because of something like housing costs, not the price of cream and kiwifruit.
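A quick sketch of the adjustment, using the rounded figures above. It treats "more than 40 per cent" as exactly 40%, so the results are approximate.

```python
# Rounded figures from the post; "more than 40 per cent" is treated as 40%.
nominal_increase = 0.40   # pavlova price rise in unadjusted dollars
inflation_factor = 1.2    # one 2007 dollar is about 1.2 2017 dollars
income_growth = 0.36      # rise in median annual household income

# Price change after converting 2007 dollars to 2017 dollars.
real_ratio = (1 + nominal_increase) / inflation_factor

# Price change relative to median household income.
income_ratio = (1 + nominal_increase) / (1 + income_growth)

print(f"real price increase: {100 * (real_ratio - 1):.0f}%")
print(f"increase relative to income: {100 * (income_ratio - 1):.0f}%")
```

So the inflation-adjusted increase is closer to 17% than 40%, and relative to median household income it is only about 3%.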

December 1, 2017


  • Testing drug sniffer dogs: “The dogs are mainly used to confirm what we already suspect,” says Fulmer. “When the dogs come out, about 99 percent of the time we get an alert. And it’s because we already know what’s in the car; we just need that confirmation to help us out with that.”  At least with the biosecurity beagles at Auckland Airport, there’s no incentive on the handler’s part for false positives.
  • Security researcher Matt Blaze talks to US Congress about voting security: “The most reliable and well-understood method to achieve this is through an approach called risk-limiting audits. In a risk limiting audit, a statistically significant randomized sample of precincts have their paper  ballots manually counted by hand and the results compared with the electronic tally. …The effect of risk-limiting audits is not to eliminate software vulnerabilities, but to ensure that the integrity of the election outcome does not depend on the herculean task of securing every software component in the system.” 
  • The Grattan Institute (in Australia) has a report (PDF) on adverse events in hospitals: Strengthening safety statistics: how to make hospital safety data more useful.  Peter Davis has an opinion piece in the Dominion Post on what NZ could do.
  • Statistical population genetics of New York rats: they stick to their neighbourhoods, just like the humans. Sarah Zhang in the Atlantic.
  • ProPublica found that Facebook won’t let you target ads based on race, or even on ethnicity — but it will let you target “African American” under “Behaviors”, sub-category “Multicultural Affinity”.   Facebook said “The rental housing ads purchased by ProPublica should have but did not trigger the extra review and certifications we put in place due to a technical failure.” The last five words of that sentence are interesting — they don’t actually add anything, but they kind of sound like they do.
November 29, 2017

Expensive road is expensive

Under the headline Auckland residents prepared to pay to fix congested road, Bernard Orsman writes about an AA survey of residents of the Devonport peninsula.  There were three options for upgrading Lake Road, costing $10 million, $40 million, or $70 million.  Of the roughly half of respondents who preferred the $70 million proposal, about half (ie, 25% of people) were willing to pay targeted rates, and two-thirds of these (17%) were willing to pay $50-$200/year.

“People are willing to pay something extra, but they want to see it happening faster as a result. AA members want to see benefits within the next five years – not 10,”

One calculation that’s left for the reader is how much would actually be needed to make up the difference between a $40m and $70m budget.  According to the local free paper, there are 8,328 households in the Devonport peninsula.  At $200/year each, that’s $1.6m/year.  Over ten years, that’s about half the gap — which might actually be reasonable.  On the other hand, $50/year each for 5 years is less than 10% of the gap — and most weren’t willing to support even this, in a survey where it didn’t actually cost anything.
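Here is that calculation as a minimal sketch. It assumes every one of the 8,328 households pays the targeted rate, with no discounting or cost inflation; the function name is only for illustration.

```python
households = 8328                  # from the local free paper
gap = 70_000_000 - 40_000_000      # difference between the two budgets, in dollars

def share_of_gap(rate_per_year, years):
    """Fraction of the budget gap raised if every household pays the rate."""
    return households * rate_per_year * years / gap

high = share_of_gap(200, 10)   # $200/year for ten years
low = share_of_gap(50, 5)      # $50/year for five years

print(f"$200/year for 10 years covers {100 * high:.0f}% of the gap")
print(f"$50/year for 5 years covers {100 * low:.0f}% of the gap")
```

Since only a minority of respondents were willing to pay anything, the amount actually raised would be smaller than either figure.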

November 23, 2017

State caricatures

This map of most disproportionately consumed Thanksgiving side dishes, from 538, is circulating again


As I’ve pointed out before, these aren’t the most commonly eaten in each state, they’re the ones that are most different from the rest of the country — a sort of caricature of the nation’s food geography. It’s actually worse than that, since this is from a relatively small poll and didn’t even record what state people were in, just what region.

Since 538 makes their data available, we can do other maps. Here’s the most commonly consumed side-dish


It’s much less interesting, but even this overstates the geographic variation.

Here, on a red-to-yellow heat scale, is the proportion of respondents who have mashed potatoes

and green beans/green bean casserole

There’s nothing necessarily wrong with the ‘most disproportionate’ map, as long as you recognise what it’s doing. But saying, as 538 did, “When you get past the poultry and check out the side dishes, though, the regional distinctions really come out” tends to hide that point.

More complicated than that

Science Daily

Computerized brain-training is now the first intervention of any kind to reduce the risk of dementia among older adults.

Daily Telegraph

Pensioners can reduce their risk of dementia by nearly a third by playing a computer brain training game similar to a driving hazard perception test, a new study suggests.

Ars Technica

Speed of processing training turned out to be the big winner. After ten years, participants in this group—and only this group—had reduced rates of dementia compared to the controls

The research paper is here, and the abstract does indeed say “Speed training resulted in reduced risk of dementia compared to control, but memory and reasoning training did not”

They’re overselling it a bit. First, these are intervals showing the ratios of number of cases with and without the three types of treatment, including the uncertainty


Summarising this as “speed training works but the other two don’t” is misleading.  There’s pretty marginal evidence that speed training is beneficial and even less evidence that it’s better than the other two.

On top of that, the results are for less than half the originally-enrolled participants, the ‘dementia’ they’re measuring isn’t a standard clinical definition, and this is a study whose 10-year follow-up ended in 2010 and that had a lot of ‘primary outcomes’ it was looking for — which didn’t include the one in this paper.

The study originally expected to see positive results after two years. It didn’t. Again, after five years, the study reported “Cognitive training did not affect rates of incident dementia after 5 years of follow-up.”  Ten-year results, reported in 2014, showed relatively modest differences in people’s ability to take care of themselves, as Hilda Bastian commented.

So. This specific type of brain training might actually help. Or one of the other sorts of brain training they tried might help. Or, quite possibly, none of them might help.  On the other hand, these are relatively unlikely to be harmful, and maybe someone will produce an inexpensive app or something.