Posts filed under Probability (66)

May 5, 2014

Verging on a borderline trend

From Matthew Hankins, via a Cochrane Collaboration blog post, the first few items on an alphabetical list of ways to describe failure to meet a statistical significance threshold:

a barely detectable statistically significant difference (p=0.073)
a borderline significant trend (p=0.09)
a certain trend toward significance (p=0.08)
a clear tendency to significance (p=0.052)
a clear trend (p<0.09)
a clear, strong trend (p=0.09)
a considerable trend toward significance (p=0.069)
a decreasing trend (p=0.09)
a definite trend (p=0.08)
a distinct trend toward significance (p=0.07)
a favorable trend (p=0.09)
a favourable statistical trend (p=0.09)
a little significant (p<0.1)
a margin at the edge of significance (p=0.0608)
a marginal trend (p=0.09)
a marginal trend toward significance (p=0.052)
a marked trend (p=0.07)
a mild trend (p<0.09)

Often there’s no need to have a threshold and people would be better off giving an interval estimate including the statistical uncertainty.

The defining characteristic of the (relatively rare) situations where a threshold is needed is that you either pass the threshold or you don’t. A marked trend towards a suggestion of positive evidence is not meeting the threshold.

March 25, 2014

An ounce of diagnosis

The Disease Prevention Illusion: a tragedy in five parts, by Hilda Bastian

“An ounce of prevention is worth a pound of cure.” We’ve recognized the false expectations we inflate with the fast and loose use of the word “cure” and usually speak of “treatment” instead. We need to be just as careful with the P-word.

 

March 18, 2014

Your gut instinct needs a balanced diet

I linked earlier to Jeff Leek’s post on fivethirtyeight.com, because I thought it talked sensibly about assessing health news stories, and how to find and read the actual research sources.

While on the bus, I had a Twitter conversation with Hilda Bastian, who had read the piece (not through StatsChat) and was Not Happy. On rereading, I think her points were good ones, so I’m going to try to explain what I like and don’t like about the piece. In the end, I think she and I had opposite initial reactions to the piece from the same starting point: the importance of separating what you believe in advance from what the data tell you. (more…)

February 13, 2014

How stats fool juries

Prof Peter Donnelly’s TED talk. You might want to skip over the first few minutes of vaguely joke-like objects.

Consider the two (coin-tossing) patterns HTH and HTT. Which of the following is true:

  1. The average number of tosses until HTH is larger than the average number of tosses until HTT
  2. The average number of tosses until HTH is the same as the average number of tosses until HTT
  3. The average number of tosses until HTH is smaller than the average number of tosses until HTT?

Before you answer, you should know that most people, even mathematicians, get this wrong.

Also, as Prof Donnelly doesn’t point out, if you have a programming language handy, you can find out the answer very easily.
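One way to find out is a quick simulation. This is a minimal sketch in Python (my choice of language, not anything from the talk); `tosses_until` and `average_wait` are made-up helper names, and the seed is fixed only so the run is repeatable. Run it before you commit to an answer.

```python
import random

def tosses_until(pattern, rng):
    """Toss a fair coin until `pattern` first appears; return the number of tosses."""
    recent = ""
    count = 0
    while not recent.endswith(pattern):
        # Keep only the last len(pattern) tosses.
        recent = (recent + rng.choice("HT"))[-len(pattern):]
        count += 1
    return count

def average_wait(pattern, trials=50_000, seed=2014):
    """Average waiting time for `pattern` over many simulated runs."""
    rng = random.Random(seed)
    return sum(tosses_until(pattern, rng) for _ in range(trials)) / trials

avg_hth = average_wait("HTH")
avg_htt = average_wait("HTT")
print(f"HTH: {avg_hth:.2f} tosses on average")
print(f"HTT: {avg_htt:.2f} tosses on average")
```

With 50,000 simulated runs per pattern, the two averages are separated by far more than the simulation noise.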

February 4, 2014

What an (un)likely bunch of tosse(r)s?

It was with some amazement that I read the following in the NZ Herald:

Since his first test in charge at Cape Town 13 months ago, McCullum has won just five out of 13 test tosses. Add in losing all five ODIs against India and it does not make for particularly pretty reading.

Then again, he’s up against another ordinary tosser in MS Dhoni, who has got it right just 21 times out of 51 tests at the helm. Three of those were in India’s past three tests.

The implication of the author seems to be that five out of 13, or 21 out of 51, are rather unlucky for a set of random coin tosses, and that the captains might somehow be able to influence the toss. They are unlucky if one hopes to win the coin toss more than lose it, but there is no reason to think that is a realistic expectation unless the captains know something about the coin that we don’t.

Again, simple application of the binomial distribution shows how ordinary these results are. If we assume that the chance of winning the toss is 50% (Pr(Win) = 0.5) each time, then in 13 throws we would expect to win, on average, 6 to 7 times (6.5 for the pedants). Random variation would mean that about 90% of the time, we would expect to see four to nine wins in 13 throws. So McCullum’s five from 13 hardly seems unlucky, or exceptionally bad. You might be tempted to think that the same may not hold for Dhoni. Just using the observed data, his estimated probability of success is 21/51 or 0.412 (3dp). This is not 0.5, but again, assuming a fair coin, and independence between tosses, it is not that unreasonable either. Using frequentist theory, and a simple normal approximation (with no small sample corrections), we would expect 96.4% of sets of 51 throws to yield somewhere between 18 and 33 successes. So Dhoni’s results are somewhat on the low side, but they are not beyond the realms of reasonable possibility.
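These figures are easy to check. The sketch below uses only the Python standard library; `binom_prob_between` is a hypothetical helper name, and the exact binomial figure for Dhoni is an extra comparison that is not in the calculation above.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_prob_between(n, lo, hi, p=0.5):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

# McCullum: 13 tosses of a fair coin, four to nine wins.
mccullum = binom_prob_between(13, 4, 9)

# Dhoni: 51 tosses, normal approximation with no continuity correction.
n, p = 51, 0.5
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
dhoni_normal = approx.cdf(33) - approx.cdf(18)

# For comparison only: the exact binomial probability of 18 to 33 wins.
dhoni_exact = binom_prob_between(51, 18, 33)

print(f"McCullum, 4 to 9 wins in 13:        {mccullum:.3f}")
print(f"Dhoni, 18 to 33 in 51 (normal):     {dhoni_normal:.3f}")
print(f"Dhoni, 18 to 33 in 51 (exact):      {dhoni_exact:.3f}")
```

The exact figure comes out a little higher than the normal approximation, because the uncorrected approximation ignores the probability mass in the half-units beyond 18 and 33.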

Taking a Bayesian stance, as is my wont, yields a similar result. If I assume a uniform prior – which says “any probability of success between 0 and 1 is equally likely”, and binomial sampling, then the posterior distribution for the probability of success follows a Beta distribution with parameters a = 21 + 1 = 22, and b = 51 – 21 + 1 = 31. There are a variety of different ways we might use this result. One is to construct a credible interval for the true value of the probability of success. Using our data, we can say there is about a 95% chance that the true value is between 0.29 and 0.55 – so again, as 0.5 is contained within this interval, it is possible. Alternatively, the posterior probability that the true probability of success is less than 0.5 is about 0.894 (3dp). That is high, but not high enough for me. It says there is about a 1 in 10 chance that the true probability of success could actually be 0.5 or higher.
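The posterior summaries can be reproduced without a statistics package. This sketch assumes integer Beta parameters, which lets the Beta CDF be written as a binomial tail sum; the function names are hypothetical, and the bisection quantile is just one simple way to get an equal-tailed interval.

```python
from math import comb

def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for integer a and b, using the identity
    I_x(a, b) = P(Binomial(a + b - 1, x) >= a)."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))

def beta_quantile(q, a, b, tol=1e-9):
    """Invert the CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

wins, tosses = 21, 51
a, b = wins + 1, tosses - wins + 1  # uniform prior gives a Beta(22, 31) posterior

lower = beta_quantile(0.025, a, b)
upper = beta_quantile(0.975, a, b)
prob_below_half = beta_cdf_int(0.5, a, b)

print(f"95% credible interval: ({lower:.2f}, {upper:.2f})")
print(f"Pr(p < 0.5 | data) = {prob_below_half:.3f}")
```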

January 19, 2014

How to beat Lotto

That is, how to gamble in a way that, over the course of a year, gives you a higher chance at a larger payout than playing NZ Lotto each week and hoping for Division 1. We all know you can’t “beat Lotto” in the usual sense of improving your odds of winning.

In the ordinary Saturday Lotto, you pick 6 numbers out of 40, and if all 6 are correct (which they aren’t) you win $1 million. The chance of winning is 1 in 3838380 per ‘line’. Suppose you play the minimum of 4 lines, for $6, each week for a year. The chance of winning in a year is one in 18453.75. That is, on average you’d expect to win once in every 18453 years and 9 months.

Alternatively, suppose you save up the $6 per week, and then at the end of the year go to a casino and play roulette. Put it all on a single number. If you win, put it all on a single number again, and then if you win, put it all on a ‘double street’ of six numbers. Your chance of winning (in double-zero roulette) is 1 in 9145.33, and if you win you will make $2426112.

So, you get twice the chance of winning as you would have for Lotto division 1, and more than twice the payout. The expected return is 85%, much better than the 56% that NZ Lotteries returns (averaged over all its games, annual report). Does that mean it’s a good idea? No. Not even slightly. You have a 37 in 38 chance of turning up with $300 and losing it in a few minutes. If you don’t, you have a 37 in 38 chance of losing about $11,000 in the next few minutes, and if you don’t, you have about an 85% chance of losing more than a quarter of a million dollars. This strategy makes your losses obvious, which makes gambling no fun. And you still only win once every 91 centuries.
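The arithmetic behind the comparison can be laid out in a few lines. This is a sketch assuming the standard roulette payouts (35 to 1 on a single number, 5 to 1 on six numbers, stake returned on a win), which are the payouts that reproduce the $2426112 figure; the variable names are mine. The yearly Lotto figure simply adds up the per-line chances.

```python
from fractions import Fraction
from math import comb

# Lotto: 6 numbers from 40, 4 lines a week for 52 weeks.
lines_per_year = 4 * 52
lotto_line_odds = comb(40, 6)                                 # "1 in" odds per line
lotto_year_odds = Fraction(lotto_line_odds, lines_per_year)   # "1 in" odds per year

# Roulette on a double-zero wheel (38 pockets):
# single number, single number again, then a six-number bet.
stake = 6 * 52                                                # a year of $6 weeks
p_win = Fraction(1, 38) * Fraction(1, 38) * Fraction(6, 38)
after_first = stake * 36                                      # 35 to 1, plus the stake
after_second = after_first * 36
payout = after_second * 6                                     # 5 to 1, plus the stake

expected_return = p_win * payout / stake

print(f"Lotto: 1 in {lotto_line_odds} per line, 1 in {float(lotto_year_odds):.2f} per year")
print(f"Roulette: 1 in {float(1 / p_win):.2f}, payout ${payout}")
print(f"Bankroll after each win: ${after_first}, ${after_second}, ${payout}")
print(f"Expected return: {float(expected_return):.1%}")
```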

Enjoyable gambling, including Lotto, is based on making your losses less obvious by masking them with small wins and stretching them out over time. Of course, that’s also what makes gambling, including Lotto, potentially addictive.

January 2, 2014

Toll, poll, and tolerance.

The Herald has a story that has something for everyone. On the front page of the website it’s labelled “Support for lower speed limit”, but when you click through it’s actually about the tighter tolerance (4km/h, rather than 10km/h) for infringement notices being used on the existing speed limits.

The story is about a real poll, which found about 2/3 support for the summer trial of the tighter speed tolerance. Unfortunately, the poll seems to have had really badly designed questions. Either that, or the reporting is jumping to unsupportable conclusions:

The poll showed that two-thirds of respondents felt that the policy was fair because it was about safety. Just 29 per cent said that it was unfair and was about raising revenue.

That is, apparently the alternatives offered to respondents combined both whether they approved of the policy and what they thought the reason was. That’s a bad idea for two reasons. Firstly, it confuses the respondents, when it’s hard enough getting good information to begin with. Secondly, it pushes them towards an answer. The story is decorated with a bogus clicky poll, which has a better set of questions, but, of course, largely meaningless results.

The story also quotes the Police Minister attributing a 25% lower death toll during the Queen’s Birthday weekends to the tighter tolerance:

“That means there is an average of 30 people alive today who can celebrate Christmas who might not otherwise have been,” Mrs Tolley said.

We’ve looked at this claim before. It doesn’t hold up. Firstly, there has been a consistently lower road toll, not just at holiday weekends. And secondly, the Ministry of Transport says that driving too fast for the conditions is even one of the contributing factors in only 29% of fatal crashes, so getting a 25% reduction in deaths just from tightening the tolerance seems beyond belief. To be fair, the Minister only said the policy “contributed” to the reduction, so even one death prevented would technically count, but that’s not the impression being given.

What’s a bit depressing is that none of the media discussion I’ve seen of the summer campaign has asked what tolerance is actually needed, based on accuracy of speedometers and police speed measurements. And while stories mention that the summer campaign is a trial run to be continued if it is successful, no-one seems to have asked what the evaluation criteria will be and whether they make sense.

(suggested by Nick Iversen)

October 9, 2013

Bell curves, bunnies, and dragons

Keith Ng points me to something that’s a bit more technical than we usually cover here on StatsChat, but it was in the New York Times, and it does have redeeming levels of cutesiness: an animation of the central limit theorem using bunnies and dragons.

The point made by the video is that the Normal distribution, or ‘bell curve’, is a good approximation to the distribution of averages even when it is a very poor approximation to the distribution of individual measurements.  Averaging knocks all the corners off a distribution, until what is left can be described just by its mean and spread.  (more…)
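A small simulation shows the corners being knocked off. This sketch is not from the video: it assumes an exponential distribution as the lopsided starting point (a stand-in for the bunnies and dragons) and uses skewness as a rough measure of how far from bell-shaped a sample is; `skewness` is a hypothetical helper.

```python
import random
from statistics import fmean, pstdev

def skewness(xs):
    """Sample skewness: roughly 0 for a symmetric, bell-shaped sample."""
    m, s = fmean(xs), pstdev(xs)
    return fmean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(2013)
n_samples, group_size = 20_000, 30

# Individual measurements from a strongly skewed distribution.
individuals = [rng.expovariate(1.0) for _ in range(n_samples)]

# Averages of 30 such measurements at a time.
averages = [
    fmean(rng.expovariate(1.0) for _ in range(group_size))
    for _ in range(n_samples)
]

skew_individuals = skewness(individuals)
skew_averages = skewness(averages)
print(f"Skewness of individual values: {skew_individuals:.2f}")
print(f"Skewness of averages of {group_size}:    {skew_averages:.2f}")
```

The individual values are heavily lopsided, while the averages are much closer to symmetric, and would get closer still with larger groups.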

October 8, 2013

Death rate bounce coming?

A good story in Stuff today about mortality rates.

A Ministry of Health report shows while death rates are as low as they have ever been since mortality data was collected, men are far more likely to die of preventable causes than women.

Heart Foundation medical director Professor Norman Sharpe said it is a gap that will continue to widen as a “new wave” of health problems caused by obesity start showing up in the statistics.

The latest mortality data, gathered from death certificates and post-mortem examinations, shows there were 28,641 deaths registered in New Zealand in 2010.

While the number of actual deaths is increasing, up 8 per cent since 1990, this was because of a growing and ageing population.

Death rates overall have dipped about 35 per cent, meaning statistically we are more likely to survive to a ripe old age.

There aren’t any of the problems I complained about in last year’s story on this topic: there’s a clear distinction between increases in rates and the impact of population size and aging, and the story admits that the problems with preventable deaths it raises are projections for the future.

While on this topic, I will point out a useful technical distinction between rates and risks.  Risks are probabilities; they don’t have any units and are at most 100%. Lifetime risks of death are exactly 100%, and are neither increasing nor decreasing.  Rates are probabilities for an interval of time; they do have units (eg % per year). Rates of death can increase or decrease, as the one death per customer is spread out over shorter or longer periods of time.

September 28, 2013

Gambling at 19-1

The IPCC report is out. We know the earth has been getting hotter: that’s just simple data analysis. The report says

It is extremely likely that more than half of the observed increase in global average surface temperature from 1951 to 2010 was caused by the anthropogenic increase in greenhouse gas concentrations and other anthropogenic forcings together. The best estimate of the human induced contribution to warming is similar to the observed warming over this period. 

Here, “extremely likely” is defined as 95-100% confidence. Since we (fortunately) don’t get a long series of potential climate catastrophes to average over, the probabilities have to be interpreted in terms of (reasonable) degrees of belief rather than relative frequency, which can be made concrete by equivalents to investment or gambling.

That is, the panel concludes no-one should be betting against a human cause for climate change unless they get better than 19-1 odds (and possibly much better, depending on where in the 95-100% range they are).  Suppose we have an opportunity to reduce greenhouse gas concentrations, which will cost $20 million, and that the money is completely wasted if the climate models are basically wrong, but which will bring in $21 million, for a $1 million profit, if the models are basically right. The evaluation as “extremely likely” means we should take these opportunities.  Investments that have, say, a net loss of $10 million if there isn’t anthropogenic warming and a net saving of $1 million if there is, are very good value.  For mitigation efforts, the odds are even more favourable: the world unquestionably has been warming, so mitigation is likely to be worthwhile even if the reason isn’t CO2.
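The betting arithmetic can be made explicit. In this sketch (function names are mine; amounts are in $ million) the break-even probability is the loss divided by the sum of loss and gain, so 19-1 odds correspond to exactly 95%. The $20 million project is a 20-1 bet that breaks even at 20/21, about 95.2%, just above the bottom of the 95-100% range, while the $10 million project is comfortably positive anywhere in the range.

```python
def break_even_probability(loss_if_wrong, gain_if_right):
    """Probability of being right at which the expected value is exactly zero."""
    return loss_if_wrong / (loss_if_wrong + gain_if_right)

def expected_value(p_right, loss_if_wrong, gain_if_right):
    """Expected net gain of the investment, in the same units as the inputs."""
    return p_right * gain_if_right - (1 - p_right) * loss_if_wrong

odds_19_to_1 = break_even_probability(19, 1)    # the 19-1 threshold
big_project = break_even_probability(20, 1)     # lose $20m or gain $1m
small_project = break_even_probability(10, 1)   # lose $10m or gain $1m

print(f"Break-even: 19-1 {odds_19_to_1:.3f}, 20-1 {big_project:.3f}, 10-1 {small_project:.3f}")
for p in (0.95, 0.975, 1.0):
    print(f"p = {p}: 20-1 bet {expected_value(p, 20, 1):+.2f}, "
          f"10-1 bet {expected_value(p, 10, 1):+.2f}")
```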

I don’t think current policies are anywhere near the 19-1 threshold. I’d be surprised if a lot of them even made sense  if the climate was offering even money.