Q: There isn’t a Nobel Prize for Statistics, is there?
Q: But there is a new big prize?
A: Yes, a group of five statistics organisations collaborated to create the “International Prize in Statistics”.
Q: And did someone win it?
A: Yes. To the vast surprise of no-one, it was won by Sir David Cox. (PDF)
Q: So what did he do?
Q: And what is the Cox model?
A: It’s a regression model for censored time-to-event data. That is, you’re interested in modelling the time until something happens (death, unemployment, graduation), but you don’t get to observe the actual time for some people: they were still alive, employed, or studying when you stopped collecting data.
Q: That sounds useful. But why hadn’t someone already done it?
A: It was 1972.
A: And they had, it’s just Cox’s model was better in some ways. In particular, it didn’t make assumptions about the rate of events over time, just about how different groups of people compared.
A: Consider smokers and non-smokers. The model might say smokers get cancer at ten times the rate of non-smokers, without having to assume anything about how those rates change with age. Earlier models would have assumed the rates were constant over time, or that they had simple mathematical forms.
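A: In symbols, as a rough sketch using conventional textbook notation (the symbols h, h_0, beta and x are assumptions here, not taken from the text above): write x = 1 for a smoker and x = 0 for a non-smoker, and let h_0(t) be the non-smokers’ rate at age t.

```latex
h(t \mid x) = h_0(t)\, e^{\beta x},
\qquad
\frac{h(t \mid x = 1)}{h(t \mid x = 0)} = e^{\beta}
```

The “ten times the rate” example corresponds to e^beta = 10 at every age, while h_0(t) is left completely unspecified. The earlier models would have required h_0(t) to be a constant or some other simple curve.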
Q: And they don’t?
Q: Ok, that sounds like a step forward. The model was popular, I suppose.
A: Yes, the paper presenting it has over 30,000 citations. It has more citations with a typo in the page number than my most-cited first-author statistics paper has in total.
Q: That many people have read it?
A: I didn’t say they’d read it. Nowadays, they mostly haven’t; they have read other papers or textbooks that mention it.
Q: So why hasn’t someone come up with a better model since 1972?
A: They have, but the Cox model is good enough to stay popular. And it was helped to popularity by being computationally well-behaved and mathematically interesting.
Q: Mathematically interesting?
A: The model is “semiparametric”: it has both rigid constraints (the ratios of rates are constant over time) and completely flexible parts (the pattern of events over time). The estimator that Cox proposed is very simple, and in particular doesn’t involve estimating the flexible part of the model. It’s very unusual for that to work well, so mathematical statisticians wanted to study it and work out how to duplicate its success.
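A: Here is a minimal sketch of the quantity that the estimator maximises, usually called the partial likelihood, for a single covariate. The function name, the toy data, and the simple handling of tied times are all illustrative assumptions, not anything from Cox’s paper. At each observed event it compares the person who had the event with everyone still being followed at that time, so the flexible part of the model never appears.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for one covariate (hypothetical helper).

    times  : follow-up time for each person
    events : 1 if the event was observed, 0 if the time is censored
    x      : covariate value for each person (e.g. 1 = smoker)
    """
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored people contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(
            math.exp(beta * x[j])
            for j in range(len(times))
            if times[j] >= t_i
        )
        total += beta * x[i] - math.log(risk)
    return total


# Made-up toy data, purely for illustration.
times = [2.0, 3.0, 5.0, 7.0, 8.0, 11.0]
events = [1, 1, 0, 1, 0, 1]
smoker = [1, 1, 0, 1, 0, 0]

ll = cox_partial_loglik(0.5, times, events, smoker)

# Stretching the time axis changes the pattern of events over time
# but not their order, so the value is unchanged.
ll_stretched = cox_partial_loglik(0.5, [t ** 3 for t in times], events, smoker)
```

Only the order of the times matters, which is why `ll` and `ll_stretched` agree: nothing about the rate of events over time has to be estimated in order to estimate beta.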
Q: And did they?
A: Not really. They understand how it works, but it’s not something you can make work in general. Cox was lucky and/or brilliant.
Q: Did Cox do anything else important?
A: Lots. He wrote or co-wrote 17 books on different areas of statistics, several of which became classics. He’s written a few hundred other research papers. He’s had 63 PhD students (he was my advisors’ advisor’s advisor’s advisor). And ...
Q: Ok, enough already. Where did he study statistics?
A: He didn’t really. He got a degree in maths (in two years, because there was a war on), then went to work for the Wool Industry Research Association before doing a PhD. Later, he moved to the US for 15 years because he couldn’t get a long-term job in Britain.
Q: Well, that part of his experience is still easy to duplicate in many countries.
A: Sadly, yes.