February 4, 2014

# What an (un)likely bunch of tosse(r)s?

It was with some amazement that I read the following in the NZ Herald:

> Since his first test in charge at Cape Town 13 months ago, McCullum has won just five out of 13 test tosses. Add in losing all five ODIs against India and it does not make for particularly pretty reading.
>
> Then again, he’s up against another ordinary tosser in MS Dhoni, who has got it right just 21 times out of 51 tests at the helm. Three of those were in India’s past three tests.

The author seems to imply that five out of 13, or 21 out of 51, is rather unlucky for a set of random coin tosses, and that the captains might somehow be able to influence the toss. These results are unlucky if one hopes to win the coin toss more often than lose it, but there is no reason to think that is a realistic expectation unless the captains know something about the coin that we don’t.

Again, a simple application of the binomial distribution shows how ordinary these results are. If we assume that the chance of winning the toss is 50% (Pr(Win) = 0.5) each time, then in 13 throws we would expect to win, on average, 6 to 7 times (6.5 for the pedants). Random variation means that about 90% of the time we would expect to see four to nine wins in 13 throws. So McCullum’s five from 13 hardly seems unlucky, or exceptionally bad.

You might be tempted to think that the same does not hold for Dhoni. Just using the observed data, his estimated probability of success is 21/51 or 0.412 (3dp). This is not 0.5, but again, assuming a fair coin and independence between tosses, it is not that unreasonable either. Using frequentist theory, and a simple normal approximation (with no small sample corrections), we would expect 96.4% of sets of 51 throws to yield somewhere between 18 and 33 successes. So Dhoni’s results are somewhat on the low side, but they are not beyond the realms of reasonable possibility.
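For anyone who wants to check the two frequentist figures, here is a minimal sketch using only the Python standard library. The helper name `binom_prob_between` is mine, purely for illustration. It assumes a fair coin and independent tosses, as in the text, uses the exact binomial for the 13 tosses, and uses the plain normal approximation (no continuity correction) for the 51 tosses.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_prob_between(n, lo, hi, p=0.5):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


# McCullum: 13 tosses, exact binomial probability of four to nine wins.
mccullum = binom_prob_between(13, 4, 9)

# Dhoni: 51 tosses, normal approximation with mean n/2 and variance n/4,
# evaluated at 18 and 33 with no continuity correction.
n = 51
approx = NormalDist(mu=n * 0.5, sigma=sqrt(n * 0.25))
dhoni = approx.cdf(33) - approx.cdf(18)

print(f"P(4 <= wins <= 9 of 13)   = {mccullum:.3f}")
print(f"P(18 <= wins <= 33 of 51) = {dhoni:.3f}")
```

The first number should land near the "about 90%" quoted above and the second near 96.4%.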

Taking a Bayesian stance, as is my wont, yields a similar result. If I assume a uniform prior, which says “any probability of success between 0 and 1 is equally likely”, and binomial sampling, then the posterior distribution for the probability of success is a Beta distribution with parameters a = 21 + 1 = 22 and b = 51 - 21 + 1 = 31. There are a variety of ways we might use this result. One is to construct a credible interval for the true value of the probability of success. Using our data, we can say there is about a 95% chance that the true value is between 0.29 and 0.55, so again, as 0.5 is contained within this interval, a fair coin is entirely plausible. Alternatively, the posterior probability that the true probability of success is less than 0.5 is about 0.894 (3dp). That is high, but not high enough for me. It says there is about a 1 in 10 chance that the true probability of success could actually be 0.5 or higher.
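The posterior summaries can be reproduced without any statistics package. The sketch below is one way to do it, not necessarily how the numbers above were obtained. It relies on the fact that, for integer a and b, the Beta(a, b) CDF at x equals the probability that a Binomial(a + b - 1, x) variable is at least a. I am assuming an equal-tailed interval (2.5% and 97.5% quantiles), found by bisection. The function names are illustrative only.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for integer a, b.

    Uses the identity I_x(a, b) = P(Y >= a) with Y ~ Binomial(a + b - 1, x).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def beta_quantile_int(q, a, b, tol=1e-10):
    """Invert the CDF by bisection (the CDF is increasing in x)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


wins, tosses = 21, 51
a, b = wins + 1, tosses - wins + 1  # uniform prior gives a Beta(22, 31) posterior

lower = beta_quantile_int(0.025, a, b)   # equal-tailed 95% interval (assumption)
upper = beta_quantile_int(0.975, a, b)
prob_below_half = beta_cdf_int(0.5, a, b)

print(f"95% credible interval: ({lower:.2f}, {upper:.2f})")
print(f"Pr(p < 0.5 | data)   = {prob_below_half:.3f}")
```

A highest-posterior-density interval would differ slightly from the equal-tailed one, but not enough to change the conclusion.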

James Curran's interests are in statistical problems in forensic science. He consults with forensic agencies in New Zealand, Australia, the United Kingdom, and the United States. He produces and maintains expert systems software for the interpretation of evidence. He has experience as an expert witness in DNA and glass evidence, appearing in courts in the United States and Australia. He has very strong interests in statistical computing, and in automation projects.

• Thomas Lumley

A uniform prior seems unduly generous in this setting. There’s fairly good prior evidence that coins have exactly two sides and that the probability is very close to equal for those sides.

Using a Beta(1000,1000) prior, which is still more diffuse than I think is warranted, the posterior probability that the success rate is less than 45% is less than 0.00001.
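A rough way to check that figure with the standard library: with a Beta(1000, 1000) prior and Dhoni's 21 wins from 51, the posterior is Beta(1021, 1030). The sketch below uses the integer-parameter link between the Beta CDF and a binomial tail, and sums the terms in log space because the binomial coefficients are far too large for ordinary floats. The function name is illustrative only, and using Dhoni's data is my reading of the comment.

```python
from math import exp, lgamma, log


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at 0 < x < 1 for integer a, b, summed in log space.

    Uses I_x(a, b) = P(Y >= a) with Y ~ Binomial(a + b - 1, x).
    """
    n = a + b - 1
    log_x, log_1mx = log(x), log(1 - x)
    total = 0.0
    for j in range(a, n + 1):
        # log of C(n, j) * x^j * (1 - x)^(n - j)
        log_term = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                    + j * log_x + (n - j) * log_1mx)
        total += exp(log_term)
    return total


prior_a, prior_b = 1000, 1000
wins, tosses = 21, 51              # Dhoni's record (assumed to be the data meant)
post_a = prior_a + wins            # 1021
post_b = prior_b + tosses - wins   # 1030

p_below_45 = beta_cdf_int(0.45, post_a, post_b)
print(f"Pr(p < 0.45 | data) = {p_below_45:.2e}")
```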

• James Curran

I agree, but I figured that if I chose anything other than a uniform prior I would be accused of being biased.

• Great title James :)

Regarding the prior, I think the question implies something like a mixture of your two priors. If some kind of cheating or weird force were operating, there’d be no reason why the interval [0.4, 0.401] should be more probable than, say, [0.2, 0.201]. The beta does this but the mixture wouldn’t because it would be uniform in the tails.

• James Curran

Give me a mixture weight for the two and I will do the numbers (and pictures)

• Thomas Lumley

Actually, this is a case where I think the Bayes Factor might be a better summary than the posterior probability.

The Bayes Factor for point mass at 0.5 vs a uniform must be a reasonable approximation. My calculations say that 5 out of 13 is a Bayes factor of 2 in favour of the uniform, so barely worth mentioning. 21 out of 51 is actually evidence for p=0.5 over the uniform; but again barely worth mentioning.

• I get a BF of 2.2 in favour of p=0.5 for the data 5/13.

• Thomas Lumley

You’re right. I had it upside down. Must be because I’m in the northern hemisphere.

So, both sets of tosses are weak evidence against the results being influenced by mystic forces, corrupt bookmakers, the NSA, global warming, etc.
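The Bayes factors discussed in this thread are easy to check. Under the point-mass hypothesis the marginal likelihood is just the binomial probability at p = 0.5; under a uniform prior, integrating the binomial likelihood over p gives 1/(n + 1) whatever the number of wins. A minimal sketch (the function name is hypothetical):

```python
from math import comb


def bayes_factor_fair_vs_uniform(wins, tosses):
    """Bayes factor for H0: p = 0.5 against H1: p ~ Uniform(0, 1).

    Values above 1 favour the fair coin.
    """
    # Marginal likelihood under H0: binomial pmf at p = 0.5.
    m_fair = comb(tosses, wins) / 2**tosses
    # Marginal likelihood under H1: the binomial pmf integrated over a
    # uniform prior on p, which equals 1 / (tosses + 1).
    m_uniform = 1 / (tosses + 1)
    return m_fair / m_uniform


results = {
    "McCullum": bayes_factor_fair_vs_uniform(5, 13),
    "Dhoni": bayes_factor_fair_vs_uniform(21, 51),
}
for name, bf in results.items():
    print(f"{name}: BF in favour of p = 0.5 is {bf:.1f}")
```

Both values come out a little above 2 in favour of p = 0.5, in line with the corrected figures above.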