
October 24, 2016

Not the Nobel Prize for Statistics

Q: There isn’t a Nobel Prize for Statistics, is there?

A: No. We already talked about that.

Q: But there is a new big prize?

A: Yes, a group of five statistics organisations collaborated to create the “International Prize in Statistics”.

Q: And did someone win it?

A: Yes. To the vast surprise of no-one, it was won by Sir David Cox. (PDF)

Q: So what did he do?

A: He invented the Cox model. (And the other Cox model, but it was the Cox model he got the prize for.)

Q: And what is the Cox model?

A: It’s a regression model for censored time-to-event data. That is, you’re interested in modelling the time until something happens (death, unemployment, graduation) and you don’t get to observe the actual time for some people: they were still alive, employed, or studying when you stopped collecting data.

Q: That sounds useful. But why hadn’t someone already done it?

A: It was 1972.

Q: Oh.

A: And they had, it’s just Cox’s model was better in some ways. In particular, it didn’t make assumptions about the rate of events over time, just about how different groups of people compared.

Q: Um..

A: Consider smokers and non-smokers. The model might say smokers get cancer at ten times the rate of non-smokers, but not have to assume anything about how those rates change with age.  Earlier models would have assumed the rates were constant over time, or that they had simple mathematical forms.

Q: And they don’t?

A: Exactly.

Q: Ok, that sounds like a step forward. The model was popular, I suppose.

A: Yes, the paper presenting it has over 30,000 citations. It has more citations with a typo in the page number than my most-cited first-author statistics paper has in total.

Q: That many people have read it?

A: I didn’t say they’d read it. Nowadays, they mostly haven’t; they have read other papers or textbooks that mention it.

Q: So why hasn’t someone come up with a better model since 1972?

A: They have, but the Cox model is good enough to stay popular. And it was helped to popularity by being computationally well-behaved and mathematically interesting.

Q: Mathematically interesting?

A: The model is “semiparametric”: it has both rigid constraints (the ratios of rates are constant over time) and completely flexible parts (the pattern of events over time).  The estimator that Cox proposed is very simple, and in particular doesn’t involve estimating the flexible part of the model. It’s very unusual for that to work well, so mathematical statisticians wanted to study it and work out how to duplicate its success.
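As a rough illustration of why the flexible part never needs estimating, here is a minimal sketch of the Cox log partial likelihood for a single covariate, assuming no tied event times. The toy data and the names `cox_partial_loglik` and `smoker` are made up for the example. Only the ordering of the times enters the calculation, so stretching the time axis leaves the value unchanged.

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for one covariate, assuming no tied event times.

    times  : follow-up time for each person
    events : 1 if the event was observed, 0 if the time is censored
    x      : covariate value for each person (e.g. 1 = smoker)
    """
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored people only contribute through risk sets
        # Everyone still under observation at time t_i
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        log_denominator = math.log(sum(math.exp(beta * x[j]) for j in risk_set))
        total += beta * x[i] - log_denominator
    return total

# Hypothetical toy data, invented for this sketch
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
smoker = [1, 0, 1, 0, 0]

original = cox_partial_loglik(0.5, times, events, smoker)
# Cubing the times changes the pattern of events over time but not their order
stretched = cox_partial_loglik(0.5, [t ** 3 for t in times], events, smoker)
print(original, stretched)
```

The baseline rate of events does not appear anywhere in the function, which is the sense in which the estimator avoids the flexible part of the model.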

Q: And did they?

A: Not really. They understand how it works, but it’s not something you can make work in general. Cox was lucky and/or brilliant.

Q: Did Cox do anything else important?

A: Lots. He wrote or co-wrote 17 books on different areas of statistics, several of which became classics. He’s written a few hundred other research papers. He’s had 63 PhD students (he was my advisor’s advisor’s advisor’s advisor). And ..

Q: Ok, enough already. Where did he study statistics?

A: He didn’t really. He got a degree in maths (in two years, because there was a war on), then went to work for the Wool Industry Research Association before doing a PhD. Later, he moved to the US for 15 years because he couldn’t get a long-term job in Britain.

Q: Well, that part of his experience is still easy to duplicate in many countries.

A: Sadly, yes.

October 23, 2016

Psychic meerkats and Halloween masks

Prediction is hard — especially,  as the Danish proverb says, when it comes to the future. In the Rugby World Cup we had psychic meerkats. For the US elections the new bogus prediction trend is Halloween masks: allegedly, more masks are sold with the face of the candidate who goes on to win.

The first question with a claim like this one, especially given some of the people making it, is whether the historical claim is true. In this case it’s true-ish. The claim was made before the 2012 election, and while the data aren’t comprehensive, they are from the same big chain of stores each year. From 1980 to 2012, the mask rule has predicted the eventual winner of the presidency every time, nine elections out of nine. That’s actually an argument against it.

If there’s more to the mask sales than there is to psychic meerkats, it would have to be as a prediction of the popular vote — you’d need data from individual states to predict the weird US Electoral College. But if the mask rule got the 2000 election right, it must have got the popular vote wrong that year — George W. Bush won the electoral college, but lost the popular vote to Al Gore. From that point of view, we’re looking at 8 out of 9.

More importantly, 9 out of 9 isn’t all that impressive. Suppose you got your predictions by flipping a coin.  Your chance of getting either all heads for the Republican wins or all heads for the Democratic wins is 1 in 256, increasing to 1 in 128 if you’re allowed to choose which way to treat the 2000 election.  The chance of getting 8 of 9 agreement is much better: about 1 in 13.  If only one in a million people in the US had tried coming up with just one prediction rule each, you’d expect someone to get it perfect and dozens to get it nearly right.
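The coin-flip arithmetic can be checked with a short sketch. Two things here are assumptions rather than anything stated above: that “8 of 9” means at least eight matches, with the same two doublings applied (either party as heads, either treatment of 2000), and that the US population is roughly 320 million, so one in a million is about 320 rule-makers. The doublings are a simple approximation, not an exact calculation.

```python
from math import comb

N_ELECTIONS = 9  # presidential elections from 1980 to 2012

def prob_at_least(k, n):
    """Probability of at least k heads in n fair coin flips."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

# All nine right, with either party allowed to play the role of "heads"
p_perfect = 2 * prob_at_least(9, N_ELECTIONS)

# Assumption: freedom to score 2000 either way roughly doubles the chance
p_perfect_flexible = 2 * p_perfect

# Assumption: at least eight matches, with the same two doublings
p_near = 2 * 2 * prob_at_least(8, N_ELECTIONS)

# Assumption: about 320 million people, so one in a million is 320 rule-makers
rule_makers = 320
expected_perfect = rule_makers * p_perfect_flexible
expected_near = rule_makers * p_near

print(f"perfect: 1 in {1 / p_perfect:.0f}")
print(f"perfect, flexible on 2000: 1 in {1 / p_perfect_flexible:.0f}")
print(f"at least 8 of 9: 1 in {1 / p_near:.1f}")
print(f"expected perfect rules: {expected_perfect}, nearly right: {expected_near}")
```

Under these assumptions the figures come out at 1 in 256, 1 in 128, and 1 in 12.8, with about 2.5 perfect rules and 25 near-misses expected.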

Given these odds, it wouldn’t be surprising if, say, a US professional sports team had results agreeing with the Presidential results — and in fact, there was a rule based on the results for the Washington Redskins football team that worked from 1940 to 2000, was fudged to work in 2004, and then failed completely in 2012.    That’s 17/19 correct, but since the rule was first publicised in the run-up to the 2000 election, it’s 2/4 correct in actual use.

If you’re allowed to combine multiple variables it gets even easier to find rules. With anything from basic linear regression to a neural network you’d expect to get perfect prediction from five unrelated variables. Even restricting the models to be simple doesn’t help much.  I downloaded some OECD data on national GDP for various countries, and found that since 1980 the Republicans have won the popular vote precisely in years when the GDP of Sweden increased more than the GDP of Norway.
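One way to put a number on this, as a sketch rather than the calculation behind the paragraph above, is Cover's counting result for linear classification rules. It assumes the nine elections behave like points in general position (roughly, continuous and unrelated predictors) and that the rule is linear with an intercept. The function name is made up for the example.

```python
from math import comb

def prob_linearly_separable(n_points, n_variables):
    """Fraction of the 2**n_points possible win/lose labelings that some
    linear rule with an intercept reproduces exactly (Cover's counting
    result, assuming points in general position)."""
    dims = n_variables + 1  # one extra dimension for the intercept
    if n_points <= dims:
        return 1.0  # any labeling can be fitted perfectly
    separable = 2 * sum(comb(n_points - 1, k) for k in range(dims))
    return separable / 2 ** n_points

# Nine elections, with one to eight unrelated predictors
for n_variables in range(1, 9):
    share = prob_linearly_separable(9, n_variables)
    print(n_variables, round(share, 3))
```

With five variables, about 86% of all possible sequences of nine election results can be matched perfectly, whatever the variables happen to be.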

My advice is to stick with the psychic meerkats for entertainment and the opinion poll aggregators or the betting markets for prediction.

October 22, 2016

Stat of the Week fixed

Because of changes at WordPress, the Stat of the Week competition has been eating the URLs you submitted.




We’ve fixed it now.

Cheese addiction hoax again

Three more sites have fallen for the cheese addiction hoax.

As you may remember, this story is very very loosely based on real research from the University of Michigan. However, the hoax version misrepresents which foods were most addictive and makes up an explanation based on the milk protein casein that isn’t mentioned in the real research at all.

The reason I’m calling this a hoax is that it wasn’t the fault of the researchers, their institution, or the journal, and it’s obvious to anyone who makes any attempt to scan the research paper that it doesn’t support the story. It isn’t an innocent mistake, and it isn’t a simple exaggeration like most misleading health science stories.

There’s a good post at Science News describing what was actually found.

October 20, 2016

Brute force and ignorance

At a conference earlier this week, a research team from Microsoft described a computer system for speech transcription. For the first time ever, this system did better than humans on a standard set of recordings.

What’s more impressive — and StatsChat relevant — is that this computer system does not understand anything about the conversations it writes down. The system does not know English, or any other human language, even in the sense that Siri does.

It has some preconceived notions about what tends to follow a particular word, pair of words, or triple of words, and about what sequences of sounds tend to follow each other, but nothing about nouns or verbs or how colorless green ideas sleep. As with modern image recognition, the system is just based on heaps and heaps of data and powerful computers.  It’s computing and statistics, not linguistics.

In a comment to a post at Language Log, the linguist Geoffrey Pullum says

I must confess that I never thought I would see this day. In the 1980s, I judged fully automated recognition of connected speech (listening to connected conversational speech and writing down accurately what was said) to be too difficult for machines, far more difficult than syntactic and semantic processing (taking an error-free written sentence as input, recognizing which sentence it was, analysing it into its structural parts, and using them to figure out its literal meaning). I thought the former would never be accomplished without reliance on the latter.

There are many problems where enough data is not available to construct a model with no understanding of the problem. There won’t be a shortage of work for human statisticians or linguists any time soon. But there are problems where brute force and ignorance works, and they aren’t always the ones we expect.

October 18, 2016

Mitre 10 Cup Predictions for the Mitre 10 Cup Semi-Finals

Team Ratings for the Mitre 10 Cup Semi-Finals

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

| Team | Current Rating | Rating at Season Start | Difference |
| --- | --- | --- | --- |
| Canterbury | 14.27 | 12.85 | 1.40 |
| Tasman | 9.18 | 8.71 | 0.50 |
| Taranaki | 8.78 | 8.25 | 0.50 |
| Auckland | 6.55 | 11.34 | -4.80 |
| Counties Manukau | 6.15 | 2.45 | 3.70 |
| Otago | 0.63 | 0.54 | 0.10 |
| Waikato | -0.37 | -4.31 | 3.90 |
| Wellington | -0.86 | 4.32 | -5.20 |
| North Harbour | -3.39 | -8.15 | 4.80 |
| Manawatu | -3.94 | -6.71 | 2.80 |
| Bay of Plenty | -4.43 | -5.54 | 1.10 |
| Hawke’s Bay | -5.76 | 1.85 | -7.60 |
| Northland | -13.35 | -19.37 | 6.00 |
| Southland | -16.96 | -9.71 | -7.30 |


Performance So Far

So far there have been 70 matches played, 50 of which were correctly predicted, a success rate of 71.4%. Here are the predictions for last week’s games.

| No. | Game | Date | Score | Prediction | Correct |
| --- | --- | --- | --- | --- | --- |
| 1 | North Harbour vs. Tasman | Oct 12 | 27 – 27 | -8.30 | FALSE |
| 2 | Taranaki vs. Auckland | Oct 13 | 35 – 32 | 6.90 | TRUE |
| 3 | Manawatu vs. Otago | Oct 14 | 14 – 21 | 0.80 | FALSE |
| 4 | Counties Manukau vs. Canterbury | Oct 15 | 33 – 21 | -7.70 | FALSE |
| 5 | Hawke’s Bay vs. Bay of Plenty | Oct 15 | 24 – 26 | 3.70 | FALSE |
| 6 | Wellington vs. Waikato | Oct 15 | 24 – 28 | 5.20 | FALSE |
| 7 | Tasman vs. Southland | Oct 16 | 56 – 0 | 25.20 | TRUE |
| 8 | Northland vs. North Harbour | Oct 16 | 28 – 44 | -3.00 | TRUE |
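The Correct column above is consistent with a simple sign check, sketched here: a prediction counts as correct when the predicted margin and the actual home-minus-away margin have the same sign. Treating a draw as a miss is an assumption read off game 1, not a stated rule, and the function name is made up.

```python
def prediction_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins pick the same winner.

    Assumption: a draw (actual margin of zero) counts as a miss.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0

# (home score, away score, predicted margin) for last week's eight games
last_week = [
    (27, 27, -8.30),
    (35, 32, 6.90),
    (14, 21, 0.80),
    (33, 21, -7.70),
    (24, 26, 3.70),
    (24, 28, 5.20),
    (56, 0, 25.20),
    (28, 44, -3.00),
]

results = [prediction_correct(*game) for game in last_week]
print(results)

# Season-to-date success rate quoted above: 50 correct out of 70 matches
print(round(100 * 50 / 70, 1))
```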


Predictions for the Mitre 10 Cup Semi-Finals

Here are the predictions for the Mitre 10 Cup Semi-Finals. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

| No. | Game | Date | Winner | Prediction |
| --- | --- | --- | --- | --- |
| 1 | Otago vs. Bay of Plenty | Oct 21 | Otago | 9.10 |
| 2 | Wellington vs. North Harbour | Oct 22 | Wellington | 6.50 |
| 3 | Canterbury vs. Counties Manukau | Oct 23 | Canterbury | 12.10 |
| 4 | Taranaki vs. Tasman | Oct 23 | Taranaki | 3.60 |
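The actual method is the one described on the Department home page. As a rough check only, the four semi-final predictions above match the difference in current ratings plus a home advantage of about 4 points. That home-advantage value is an assumption inferred from the tables, not a documented parameter, and `predicted_margin` is a made-up name.

```python
HOME_ADVANTAGE = 4.0  # assumption: inferred from the published numbers

def predicted_margin(home_rating, away_rating, home_advantage=HOME_ADVANTAGE):
    """Expected points difference, positive meaning a home win."""
    return home_rating - away_rating + home_advantage

# Current ratings taken from the Mitre 10 Cup ratings table
ratings = {
    "Canterbury": 14.27,
    "Tasman": 9.18,
    "Taranaki": 8.78,
    "Counties Manukau": 6.15,
    "Otago": 0.63,
    "Wellington": -0.86,
    "North Harbour": -3.39,
    "Bay of Plenty": -4.43,
}

semi_finals = [
    ("Otago", "Bay of Plenty"),
    ("Wellington", "North Harbour"),
    ("Canterbury", "Counties Manukau"),
    ("Taranaki", "Tasman"),
]

predictions = [
    round(predicted_margin(ratings[home], ratings[away]), 1)
    for home, away in semi_finals
]
print(predictions)
```

Note that Taranaki is rated slightly below Tasman, so on this reading its predicted win comes entirely from playing at home.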


Currie Cup Predictions for the Currie Cup Final

Team Ratings for the Currie Cup Final

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

| Team | Current Rating | Rating at Season Start | Difference |
| --- | --- | --- | --- |
| Lions | 9.42 | 9.69 | -0.30 |
| Cheetahs | 7.26 | -3.42 | 10.70 |
| Blue Bulls | 4.82 | 1.80 | 3.00 |
| Western Province | 2.97 | 6.46 | -3.50 |
| Sharks | 2.67 | -0.60 | 3.30 |
| Pumas | -12.52 | -8.62 | -3.90 |
| Griquas | -12.69 | -12.45 | -0.20 |
| Cavaliers | -13.07 | -10.00 | -3.10 |
| Kings | -20.29 | -14.29 | -6.00 |


Performance So Far

So far there have been 37 matches played, 27 of which were correctly predicted, a success rate of 73%.
Here are the predictions for last week’s games.

| No. | Game | Date | Score | Prediction | Correct |
| --- | --- | --- | --- | --- | --- |
| 1 | Blue Bulls vs. Western Province | Oct 15 | 36 – 30 | 5.20 | TRUE |
| 2 | Cheetahs vs. Lions | Oct 15 | 55 – 17 | -2.50 | FALSE |


Predictions for the Currie Cup Final

Here are the predictions for the Currie Cup Final. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

| No. | Game | Date | Winner | Prediction |
| --- | --- | --- | --- | --- |
| 1 | Cheetahs vs. Blue Bulls | Oct 22 | Cheetahs | 5.90 |


October 17, 2016


  • Beautiful weather maps from Ventusky, via Jenny Bryan
  • From BusinessInsider: 90% of executive board members think the ideal proportion of women on boards is higher than the current 20%, but the majority think it should still be 40% or less.
  • The Ministry for Social Development is collecting more data on people who use government-supported community services. On one hand, they’re less likely to misuse it than a lot of internet companies; on the other hand, it might well deter people from seeking help. And while the Ministry is getting written consent, the people obtaining it won’t get paid by the Ministry if consent isn’t given.
  • If you only read one summary of the state of the US elections, the 538 update is a relatively painless and informative one.
  • People might be worrying too much about hackers (techy)

Moreover, we find that cyber incidents cost firms only a 0.4% of their annual revenues, much lower than retail shrinkage (1.3%), online fraud (0.9%), and overall rates of corruption, financial misstatements, and billing fraud (5%).


“Kind of” being an important qualifier here.

October 13, 2016

Weighting surveys

From the New York Times: “How One 19-Year-Old Illinois Man Is Distorting National Polling Averages”

There is a 19-year-old black man in Illinois who has no idea of the role he is playing in this election.

He is sure he is going to vote for Donald J. Trump.

I think the story exaggerates the impact of this guy’s opinions on polling averages, but it’s a great illustration of one of the subtleties of polling.

Even in New Zealand, you often see people claiming, for example, that opinion polls will underestimate the Green Party vote because Green voters are younger and more urban, and so are less likely to have landline phones. As we see from the actual elections, that isn’t true. Pollsters know about these simple forms of bias, and use weighting to fix them: if they poll half as many young voters as they should, each of their votes counts twice. Weighting isn’t as good as actually having a representative sample, but it’s ok, and unlike actually having a representative sample, it’s achievable.
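A minimal sketch of that weighting, with made-up numbers: suppose young voters are 30% of the population but only 15% of a sample of 1000. Each group's weight is its population share divided by its sample share. The group names, shares, and support counts are all hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per respondent in each group: population share / sample share."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] * n / sample_counts[group]
        for group in sample_counts
    }

# Hypothetical sample: young voters polled at half the rate they should be
sample_counts = {"young": 150, "older": 850}
population_shares = {"young": 0.30, "older": 0.70}
# Hypothetical number in each group supporting some party
supporters = {"young": 30, "older": 85}

weights = poststratification_weights(sample_counts, population_shares)

unweighted = sum(supporters.values()) / sum(sample_counts.values())
weighted = sum(weights[g] * supporters[g] for g in supporters) / sum(
    weights[g] * sample_counts[g] for g in sample_counts
)
print(weights)
print(unweighted, weighted)
```

Here each young respondent counts twice, and the estimated support moves from 11.5% unweighted to 13% weighted.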

One of the tricky parts of weighting is which groups to weight. If you make the groups too broadly-defined, you don’t remove enough bias; if you make them too narrowly-defined, you end up with a few people getting really extreme weights, making the sampling error much larger than it should be. That’s what happened here: the survey had one person in one of its groups, and that person turned out to be unusual. But it gets worse.
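The cost of extreme weights can be sketched with Kish's approximate effective sample size, (sum of weights) squared divided by the sum of squared weights. This is a standard rule of thumb, not necessarily what any particular pollster uses, and the weight of 30 below is a made-up value for illustration.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_of_squares = sum(w * w for w in weights)
    return total * total / total_of_squares

# 1000 respondents with equal weights: nothing is lost
equal_weights = [1.0] * 1000

# Hypothetical: one respondent in a tiny group gets a weight of 30
one_extreme_weight = [1.0] * 999 + [30.0]

print(kish_effective_sample_size(equal_weights))
print(kish_effective_sample_size(one_extreme_weight))
```

In this made-up case a single heavily weighted respondent cuts the effective sample size from 1000 to roughly 558.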

The impact of the weighting was amplified because this is a panel survey, polling the same people repeatedly. Panel surveys are useful because they allow much more accurate estimation of changes in opinions, but an unlucky sample will persist over many surveys.

Worse still, one of the weighting factors used was how people say they voted in 2012. That sounds sensible, but it breaks one of the key assumptions about weighting variables: you need to know the population totals.  We know the totals for how the population really voted in 2012, but reported vote isn’t the same thing at all — people are surprisingly unreliable at reporting how they voted in the past.

The actual impact on polling aggregators such as 538 is probably pretty small, since they model and try to remove ‘house effects’ (differences between surveys). However, the poll does give aid and comfort to people who don’t want to believe the consensus results, and that is not helpful.

October 11, 2016


  • A curriculum to help kids think critically about health claims has been developed — and is being evaluated in a randomised trial in Uganda (from Vox)
  • Someone else (the website Grub Street) has fallen for the cheese addiction hoax. I wrote here about how the story makes no sense.  There’s a post by SciCurious that includes an interview with one of the people behind the actual research, talking about how the story just isn’t supported by her work. We still don’t seem to know who is pushing the hoax version.
  • I was on RadioNZ’s Our Changing World, talking to Allison Ballance about means and medians
  • Using mathematics (or statistics) to help with art repair:  Ingrid Daubechies talks about her work.
  • From MBIE, an interactive map of NZ tourist numbers

This has been an urban legend in the UK — it’s true in Melbourne, though mostly because the Mt Waverley reservoir is a small storage buffer rather than main storage