Posts written by Thomas Lumley (1249)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

August 22, 2014

Margin of error for minor parties

The 3% ‘margin of error’ usually quoted for polls is actually the ‘maximum margin of error’, and is an overestimate for minor parties. On the other hand, it also assumes simple random sampling and so tends to be an underestimate for major parties.

In case anyone is interested, I have done the calculations for a range of percentages (code here), both under simple random sampling and under one assumption about real sampling.
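As a rough check on where the quoted 3% comes from, here is a minimal sketch of the usual normal-approximation margin of error. The function name `margin_of_error` and its `design_effect` argument are illustrative and are not taken from the linked code; the formula assumes simple random sampling unless a variance inflation factor is supplied.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, design_effect=1.0, level=0.95):
    """Normal-approximation margin of error for a proportion p among n respondents.

    design_effect is an assumed variance inflation factor
    (1.0 corresponds to simple random sampling).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    return z * sqrt(design_effect * p * (1 - p) / n)

# p * (1 - p) is largest at p = 0.5, so the 50% case gives the maximum margin.
for pct in (50, 10, 1):
    moe = margin_of_error(pct / 100, 1000)
    print(f"{pct}%: +/- {100 * moe:.1f} percentage points")
```

The symmetric formula becomes a poor approximation for very small percentages, which is one reason to report separate lower and upper limits for minor parties.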


Lower and upper ‘margin of error’ limits for a sample of size 1000 and the observed percentage, under the usual assumptions of independent sampling

Percentage lower upper
1 0.5 1.8
2 1.2 3.1
3 2.0 4.3
4 2.9 5.4
5 3.7 6.5
6 4.6 7.7
7 5.5 8.8
8 6.4 9.9
9 7.3 10.9
10 8.2 12.0
15 12.8 17.4
20 17.6 22.6
30 27.2 32.9
50 46.9 53.1


Lower and upper ‘margin of error’ limits for a sample of size 1000 and the observed percentage, assuming that complications in sampling inflate the variance by a factor of 2, which empirically is about right for National.

Percentage lower upper
1 0.3 2.3
2 1.0 3.6
3 1.7 4.9
4 2.5 6.1
5 3.3 7.3
6 4.1 8.5
7 4.9 9.6
8 5.8 10.7
9 6.6 11.9
10 7.5 13.0
15 12.0 18.4
20 16.6 23.8
30 26.0 34.2
50 45.5 54.5
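The post does not say which interval the linked code uses. The sketch below assumes exact (Clopper-Pearson) binomial limits, and it treats the variance inflation factor as dividing the sample size, so a factor of 2 turns 1000 respondents into an effective 500. All function names are illustrative. Under those assumptions it should land close to the limits tabulated above.

```python
from math import exp, lgamma, log, log1p

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), with each term computed in log space."""
    if x < 0:
        return 0.0
    if x >= n:
        return 1.0
    total = 0.0
    for k in range(x + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log1p(-p))
        total += exp(log_pmf)
    return min(total, 1.0)

def solve_decreasing(f, target, iters=60):
    """Bisection on (0, 1) for f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(percent, n=1000, design_effect=1.0, level=0.95):
    """Lower and upper limits (in percent) for an observed percentage."""
    n_eff = round(n / design_effect)       # assumed effective sample size
    x = round(percent / 100 * n_eff)       # implied number of supporters
    alpha = 1 - level
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n_eff, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n_eff else solve_decreasing(
        lambda p: binom_cdf(x, n_eff, p), alpha / 2)
    return 100 * lower, 100 * upper

for pct in (1, 5, 50):
    srs = exact_interval(pct)
    inflated = exact_interval(pct, design_effect=2)
    print(pct, [round(v, 1) for v in srs], [round(v, 1) for v in inflated])
```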

California drought visualisation


From XKCD. Both the data and the display technique are worth looking at.



Presumably you could do something similar with New Zealand, which is roughly the same shape.

August 21, 2014

Auckland rates arithmetic

In today’s Herald story about increases in rates and impact on renters, it’s not that the numbers are wrong, it’s that they haven’t been subjected to the right sorts of basic arithmetic.

The lead is

Auckland landlords are hiking rents amid fears of big rates increases next year on the back of spiralling property values.

and later on

Increases in landlords’ expenses, including rates, mortgage interest rates and insurance premiums, could push up rent on a three-bedroom Auckland house by between $20 and $40 a week, he said.

‘Including’ is doing a lot of work in that sentence. The implications are particularly unfortunate in a story targeted at renters, who don’t get sent rates information directly and are less likely to know the details of the system.

The first place to start is with a rough estimate of how much money we’re looking at. One of the few useful things the Taxpayers’ Union has done is to collate data on rates, hosted now at Stuff. The average Auckland rates bill was $2636.  That’s all residences, not three-bedroom houses, but the order of magnitude should be right. An annual bill of $2636 is about $50/week. If the average total weekly rates payment is around $50, the average increase can’t reasonably be a big fraction of $20-$40/week or there’d be a lot more rioting in the streets.

Anyone who owns a house in Auckland or checks the Council website should know there is a cap on rates increases to cover the neighbourhoods where prices are increasing fastest. The cap is 10%/year; no rates increase faster than that, and most increase slower.  To get more detailed information you’d need to look at the website describing 2014/2015 rates changes, and find that the average increase for residential properties is 3.7%, then calculate that 3.7% of $50/week is about $2/week.

According to the Reserve Bank, both floating and two-year-fixed mortgage interest rates have gone up 0.5% since last year.  That’s $9.60/week per $100,000 of mortgage, so it’s likely to be a much bigger component of the rental cost increase than the rates are.
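The arithmetic in the last few paragraphs can be laid out as a short script. The dollar figures and percentages are the ones quoted above; the variable names are illustrative, and treating the mortgage cost as simple extra interest on the amount borrowed is an assumption.

```python
WEEKS_PER_YEAR = 52

def per_week(dollars_per_year):
    """Convert an annual dollar amount to a weekly one."""
    return dollars_per_year / WEEKS_PER_YEAR

annual_rates = 2636        # average Auckland rates bill, $/year
average_increase = 0.037   # average residential rates increase, 2014/2015
rates_cap = 0.10           # maximum annual rates increase
rate_rise = 0.005          # mortgage rates up 0.5 percentage points
principal = 100_000        # per $100,000 borrowed (interest only, an assumption)

weekly_rates = per_week(annual_rates)
typical_rates_rise = weekly_rates * average_increase
capped_rates_rise = weekly_rates * rates_cap
extra_interest = per_week(principal * rate_rise)

print(f"Average rates bill:       ${weekly_rates:.2f}/week")
print(f"Average rates increase:   ${typical_rates_rise:.2f}/week")
print(f"Increase at the 10% cap:  ${capped_rates_rise:.2f}/week")
print(f"Extra mortgage interest:  ${extra_interest:.2f}/week per $100,000")
```

Even at the cap, the rates increase on an average bill is well short of $20/week, while the extra interest on a few hundred thousand dollars of mortgage is not.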

The average increase in rates is a lot slower than the increase in property prices (10% in the year to July), but you’d expect it to be. The council doesn’t set a fixed percentage of value from year to year and live with real-estate price fluctuations. It sets a budget for total rates income, and then distributes the cost using a combination of a fixed charge and a proportion of value. In other words, the increase in average real-estate prices in Auckland has no direct impact on average increase in rates — it’s just that if your house value has gone up more than average, your rates will tend to go up more than average.   Increases in average real-estate price obviously do lead to increases in rental price, but rates are not the mechanism.

The Council is currently working on a ten-year plan, including the total rates income over that period of time. It will be open for public comment in January.


August 20, 2014

Good neighbours make good fences

Two examples of neighbourly correlations, at least one of which is not causation.

1. A (good) Herald story today, about research in Michigan that found people who got on well with their neighbours were less likely to have heart attacks.

2. An old Ministry of Justice report showing people who told their neighbours whenever they went away were much less likely to get burgled.

The burglary story is the one we know is mostly not causal.  People who tell their neighbours whenever they go on holiday were about half as likely to have experienced a burglary, but only about one burglary in seven happened while the residents were on holiday. There must be something else about types of neighbourhoods or relationships with neighbours that explains most of the correlation.
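To see why the burglary numbers cannot be mostly causal, here is a small sketch of the bound. It assumes that telling the neighbours can only prevent burglaries that happen while the residents are away, and that the one-in-seven share applies to households that don't tell their neighbours; both are simplifications, and the names are illustrative.

```python
from fractions import Fraction

holiday_share = Fraction(1, 7)  # share of burglaries that happen during holidays
observed_ratio = 0.5            # burglary risk for tellers vs non-tellers ("about half")

def best_case_ratio(holiday_share, holiday_risk_reduction=1):
    """Risk ratio if telling neighbours only affects holiday burglaries.

    holiday_risk_reduction=1 is the most favourable case: every holiday
    burglary is prevented.
    """
    return 1 - holiday_share * holiday_risk_reduction

causal_floor = best_case_ratio(holiday_share)
# Fraction of the observed reduction that the causal story could account for.
explainable = (1 - causal_floor) / (1 - observed_ratio)

print(f"Best-case risk ratio: {float(causal_floor):.2f}")
print(f"Observed risk ratio:  {observed_ratio:.2f}")
print(f"Share of the reduction that could be causal: at most {explainable:.0%}")
```

Even if telling the neighbours prevented every holiday burglary, the risk would only fall by a seventh, so most of the halving has to come from somewhere else.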

I’m pretty confident the heart-disease story works the same way.  The researchers had some possible explanations

The mechanism behind the association was not known, but the team said neighbourly cohesion could encourage physical activities such as walking, which counter artery clogging and disease.

That could be true, but is it really more likely that talking to your neighbours makes you walk around the neighbourhood or work in the garden, or that walking around the neighbourhood and working in the garden leads to talking to your neighbours? On top of that, the correlation with neighbourly cohesion was rather stronger than the correlation previously observed with walking.

August 19, 2014

Fortune cookie endings

Or, often in NZ papers, “… in the UK”.

There’s a Herald story with the lead

More than 12,000 new cases of cancer every year can be attributed to the patient being overweight or obese, the biggest ever study of the links between body mass index and cancer has revealed.

Since there are about 20,000 new cases of cancer a year in NZ, that would be quite a lot.  The story never actually comes out and says the 12,000 is for the UK, but it is, and if you read the whole thing it becomes fairly clear.  It still seems the sort of context that a reader might find helpful.

“More maps that won’t change your mind about racism in America”



Ultimately, despite the centrality of social media to the protests and our ability to come together and reflect on the social problems at the root of Michael Brown’s shooting, these maps, and the kind of data used to create them, can’t tell us much about the deep-seated issues that have led to the killing of yet another unarmed young black man in our country. And they almost certainly won’t change anyone’s mind about racism in America. They can, instead, help us to better understand how these events have been reflected on social media, and how even purportedly global news stories are always connected to particular places in specific ways.

August 18, 2014

Health/nutrition claims: baby and bathwater

Australia and New Zealand are introducing new food labelling legislation that will reduce the scope for bogus health and nutrition claims (the only bogus claims allowed will be the ones that slipped into the official code).  This is a Good Thing, as I have said in the past.

The legislation also says you can’t make health claims about booze. This is probably a Good Thing, although I don’t see why calorie/carbohydrate claims shouldn’t be allowed.  However, there’s a serious bug in the standards: one of the claims that’s specifically disallowed for alcoholic beverages is “gluten-free.”

It’s true that “gluten-free” has become a trendy bogus nutrition claim, but it’s also vital health information for some people, particularly those with coeliac disease. In that context, “gluten-free” is more like an allergen warning (“May contain nuts”) than a nutrition warning.  In fact, if you look at the section on “Mandatory Warning and Advisory Statements and Declarations”, Clause 4 includes

Cereals containing gluten and their products, namely, wheat, rye, barley, oats and spelt and their hybridised strains other than where these substances are present in beer and spirits standardised in Standards 2.7.2 and 2.7.5 respectively

along with peanuts, soybeans, eggs, milk, etc.  That is, declaring the presence of gluten is mandatory except in beer, where it is the only one of the Clause 4 mandatory warnings that becomes forbidden.  Banning gluten-free labelling on beer is deliberate and planned; it didn’t just fall between the cracks.

Since this is a trans-Tasman law, it’s going to be a pain to revise.  There seems to be one possible loophole. In the Nutrition/Health claims standards, there is provision for endorsements by independent endorsing bodies. These are exempted from most of the health/nutrition regulations. As the Explanatory Text says:

Endorsements are exempt from the other requirements of the Standard (except clause 7), to allow for endorsement programs which use the criteria set by the endorsing body.

It appears (though I may have missed something, and I’m not a lawyer) that Coeliac New Zealand could still endorse gluten-free beers, even though the brewers couldn’t make the same claims themselves.

[Further update: MPI contacted Keruru Brewery and say they are now working on a solution for gluten-free beer.]

[update: I heard about this on Twitter, but the blog post that kicked off Twitter is here]

August 17, 2014

“Evidence”-based sentencing

Predictive risk scores for re-offending are increasingly used in the US. An opinion piece in the New York Times argues this is bad

The basic problem is that the risk scores are not based on the defendant’s crime. They are primarily or wholly based on prior characteristics: criminal history (a legitimate criterion), but also factors unrelated to conduct. Specifics vary across states, but common factors include unemployment, marital status, age, education, finances, neighborhood, and family background, including family members’ criminal history.



  • Jawbone (who make gadgets that tell you if you’re awake and walking around) have made some interesting graphics on sleep and activity in cities around the world.
  • The Slate Money podcast has some nice discussion of data science jobs, from a range of viewpoints (starting at about 24:45, or listen to the whole thing and learn about Buzzfeed and about the payday loan industry)

Health evidence: quality vs quantity

From the Sunday Star-Times, on fish oil

Grey and colleague Dr Mark Bolland studied 18 randomised controlled trials and six meta-analyses of trials on fish oil published between 2005 and 2013. Only two studies showed any benefit but most media coverage of the studies was very positive for the industry.

On the other hand, the CEO of a fish-oil-supplement company disagrees

Keeley said more than 25,000 peer-reviewed scientific papers supported the benefits of omega-3. “With that extensive amount of robust study to be then challenged by a couple of meta-analyses where negative reports are correlated together dumbfounds me.”

In fact, it happens all the time that large numbers of research papers and small experiments find something is associated with health, and then small numbers of large randomised trials show it doesn’t really help.  If it didn’t happen, medical and public health research would be much faster, cheaper, and more effective. I’m a coauthor on at least a couple of those 25,000 peer-reviewed papers, and I’ve worked with people who wrote a bunch more of them, and I’m not dumbfounded. You don’t judge weight of evidence by literally weighing the papers.

Mr Keeley takes fish oil himself, and believes he will “live to 70, or 80 or 90 and not suffer from Alzheimer’s.”  That’s actually about what you’d expect without fish oil. He’s 60 now, so his statistical life expectancy is another 23 years, and by 83, less than 10% of people have developed dementia.

I wouldn’t say there was compelling evidence that fish-oil capsules are useless, but the weight of evidence is not in favour of them doing much good.