Posts filed under Risk (216)

March 29, 2017

Technological progress in NZ polling

From a long story at

For the first time ever, Newshub and Reid Research will conduct 25 percent of its polling via the internet. The remaining 75 percent of polling will continue to be collected via landline phone calls, with its sampling size of 1000 respondents and its margin of error of 3.1 percent remaining unchanged. The addition of internet polling—aided by Trace Research and its director Andrew Zhu—will aim to enhance access to 18-35-year-olds, as well as better reflect the declining use of landlines in New Zealand.

This is probably a good thing, not just because it’s getting harder to sample people. Relying on landlines leads people who don’t understand polling to assume that, say, the Greens will do much better in the election than in the polls because their voters are younger. And they don’t.

The downside of polling over the internet is it’s much harder to tell from outside if someone is doing a reasonable job of it. From the position of a Newshub viewer, it may be hard even to distinguish bogus online clicky polls from serious internet-based opinion research. So it’s important that Trace Research gets this right, and that Newshub is careful about describing different sorts of internet surveys.

As Patrick Gower says in the story

“The interpretation of data by the media is crucial. You can have this methodology that we’re using and have it be bang on and perfect, but I could be too loose with the way I analyse and present that data, and all that hard work can be undone by that. So in the end, it comes down to me and the other people who present it.”

It does. And it’s encouraging to see that stated explicitly.

November 26, 2016

Where good news and bad news show up

In the middle of last year, the Herald had a story in the Health & Wellbeing section about solanezumab, a drug candidate for Alzheimer’s disease. The lead was

The first drug that slows down Alzheimer’s disease could be available within three years after trials showed it prevented mental decline by a third.

Even at the time, that was an unrealistically hopeful summary. The actual news was that solanezumab had just failed in a clinical trial, and its manufacturers, Eli Lilly, were going to try again, in milder disease cases, rather than giving up.

That didn’t work, either.  The story is in the Herald, but now in the Business section. The (UK) Telegraph, where the Herald’s good-news story came from, hasn’t yet mentioned the bad news.

If you read the health sections of the media you’d get the impression that cures for lots of diseases are just around the corner. You shouldn’t have to read the business news to find out that’s not true.

November 4, 2016

Unpublished clinical trials

We’ve known since at least the 1980s that there’s a problem with clinical trial results not being published. Tracking the non-publication rate is time-consuming, though.  There’s a new website out that tries to automate the process, and a paper that claims it’s fairly accurate, at least for the subset of trials registered at  It picks up most medical journals and also picks up results published directly at — an alternative pathway for boring results such as dose equivalence studies for generics.

Here’s the overall summary for all trial organisers with more than 30 registered trials:


The overall results are pretty much what people have been claiming. The details might surprise you if you haven’t looked into the issue carefully. There’s a fairly pronounced difference between drug companies and academic institutions — the drug companies are better at publishing their trials.

For example, compare Merck to the Mayo Clinic
merck mayo

It’s not uniform, but the trend is pretty clear.


October 31, 2016

Give a dog a bone?

From the Herald (via Mark Hanna)

Warnings about feeding bones to pets are overblown – and outweighed by the beneficial effect on pets’ teeth, according to pet food experts Jimbo’s.


To back up their belief in the benefits of bones, Jimbo’s organised a three-month trial in 2015, studying the gums and teeth of eight dogs of various sizes.

Now, I’m not a vet. I don’t know what the existing evidence is on the benefits or harms of bones and raw food in pets’ diets. The story indicates that it’s controversial. So does Wikipedia, but I can’t tell whether this is ‘controversial’ as in the Phantom Time Hypothesis or ‘controversial’ as in risks of WiFi or ‘controversial’ as in the optimal balance of fats in the human diet. Since I don’t have a pet, this doesn’t worry me. On the other hand, I do care what the newspapers regard as reliable evidence, and Jimbo’s ‘Bone A Day’ Dental Trial is a good case to look at.

There are two questions at issue in the story: is feeding bones to dogs safe, and does it prevent gum disease and tooth damage? The small size of the trial limits what it can say about both questions, but especially about safety.  Imagine that a diet including bones resulted in serious injuries for one dog in twenty, once a year on average. That’s vastly more dangerous than anyone is actually claiming, but 90% of studies this small would still miss the risk entirely.  A study of eight dogs for three months will provide almost no information about safety.

For the second question, the small study size was aggravated by gum disease not being common enough.  Of the eight dogs they recruited, two scored ‘Grade 2’ on the dental grading, meaning “some gum inflammation, no gum recession“, and none scored worse than that.   Of the two dogs with ‘some gum inflammation’, one improved.  For the other six dogs, the study was effectively reduced to looking at tartar — and while that’s presumably related to gum and tooth disease, and can lead to it, it’s not the same thing.  You might well be willing to take some risk to prevent serious gum disease; you’d be less willing to take any risk to prevent tartar.  Of the four dogs with ‘Grade 1: mild tartar’, two improved.  A total of three dogs improving out of eight isn’t much to go on (unless you know that improvement is naturally very unusual, which they didn’t claim).

One important study-quality issue isn’t clear: the study description says the dental grading was based on photographs, which is good. What they don’t say is when the photograph evaluation was done.  If all the ‘before’ photos were graded before the study and all the ‘after’ photos were graded afterwards, there’s a lot of room for bias to creep in to the evaluation. For that reason, medical studies are often careful to mix up ‘before’ and ‘after’ or ‘treated’ and ‘control’ images and measure them all at once.  It’s possible that Jimbo’s did this, and that person doing the grading didn’t know which was ‘before’ and which was ‘after’ for a given dog. If before-after wasn’t masked this way, we can’t be very confident even that three dogs improved and none got worse.

And finally, we have to worry about publication bias. Maybe I’m just cynical, but it’s hard to believe this study would have made the Herald if the results had been unfavourable.

All in all, after reading this story you should still believe whatever you believed previously about dogfood. And you should be a bit disappointed in the Herald.

October 18, 2016

The lack of change is the real story

The Chief Coroner has released provisional suicide statistics for the year to June 2016.  As I wrote last year, the rate of suicide in New Zealand is basically not changing.  The Herald’s story, by Martin Johnston, quotes the Chief Coroner on this point

“Judge Marshall interpreted the suicide death rate as having remained consistent and said it showed New Zealand still had a long way to go in turning around the unacceptably high toll of suicide.”

The headline and graphs don’t make this clear

Here’s the graph from the Herald


If you want a bar graph, it should go down to zero, and it would then show how little is changing


I’d prefer a line graph showing expected variation if there wasn’t any underlying change: the shading is one and two standard deviations around the average of the nine years’ rates


As Judge Marshall says, the suicide death rate has remained consistent. That’s our problem.  Focusing on the year to year variation misses the key point.

June 22, 2016

Making hospital data accessible

From the Guardian

The NHS is increasingly publishing statistics about the surgery it undertakes, following on from a movement kickstarted by the Bristol Inquiry in the late 1990s into deaths of children after heart surgery. Ever more health data is being collected, and more transparent and open sharing of hospital summary data and outcomes has the power to transform the quality of NHS services further, even beyond the great improvements that have already been made.

The problem is that most people don’t have the expertise to analyse the hospital outcome data, and that there are some easy mistakes to make (just as with school outcome data).

A group of statisticians and psychologists developed a website that tries to help, for the data on childhood heart surgery.  Comparisons between hospitals in survival rate are very tempting (and newsworthy) here, but misleading: there are many reasons children might need heart surgery, and the risk is not the same for all of them.

There are two, equally important, components to the new site. Underneath, invisible to the user, is a statistical model that predicts the surgery result for an average hospital, and the uncertainty around the prediction. On top is the display and explanation, helping the user to understand what the data are saying: is the survival rate at this hospital higher (or lower) than would be expected based on how difficult their operations are?

June 14, 2016

Why everyone trusts us

  • In the UK, there’s been a big increase in the use of National Health Service data to track illegal immigrants — this was previously just done for serious criminals. (Buzzfeed)
  • CHICAGO — In this city’s urgent push to rein in gun and gang violence, the Police Department is keeping a list. Derived from a computer algorithm that assigns scores based on arrests, shootings, affiliations with gang members and other variables, the list aims to predict who is most likely to be shot soon or to shoot someone. New York Times
  • There’s a new UK website that does detailed analysis of your social media to tell your landlord whether you’ll be able to pay your rent. “If you’re living a normal life,” Thornhill reassures me, “then, frankly, you have nothing to worry about.” (Washington Post)
  • “We don’t turn people away,” Might said, but the cable company’s technicians aren’t going to “spend 15 minutes setting up an iPhone app” for a customer who has a low FICO score.  (fiercecable, via
  • Another startupclaims it can “reveal” your personality “with a high level of accuracy” just by analyzing your face, be that facial image captured via photo, live-streamed video, or stored in a database. It then sorts people into categories; with some labels as potentially dangerous such as terrorist or pedophile,” (also via
June 13, 2016

Reasonable grounds

Mark Hanna submitted an OIA request about strip searches in NZ prisons, which carried out with ‘reasonable grounds to believe’ the prisoner has an unauthorised item.  You can see the full response at FYI. He commented that 99.3% of these searches find nothing.

Here’s the monthly data over time:

The positive predictive value of having ‘reasonable grounds’  is increasing, and is up to about 1.5% now. That’s still pretty low. How ‘reasonable’ it is depends on what proportion of the time people who aren’t searched have unauthorised items: if that were, say, 1 in 1000, having ‘reasonable grounds’ would be increasing it 5-15-fold, which might conceivably count as reasonable.

We can look at the number of searches conducted, to see if that tells us anything about trends
Again, there’s a little good news: the number of strip searches has fallen over the the past couple of years. That’s a real rise and fall — the prison population has been much more stable. The trend looks very much like the first trend upside down.

Here’s the trend for number (not proportion) of searches finding something
It’s pretty much constant over time.

Statistical models confirm what the pictures suggest: the number of successful searches is essentially uncorrelated with the total number of searches. This is also basically good news (for the future, if not the past): it suggests that a further reduction in strip searches may well be possible at no extra risk.

May 24, 2016

Knowing what you’re predicting: drug war edition

From Public Address,

The woman was evicted by Housing New Zealand months ago after “methamphetamine contamination” was detected at her home. The story says it’s “unclear” whether the contamination happened during her tenancy or is the fault of a previous tenant.

There’s no allegation of a meth lab being run; the claim is that methamphetamine contamination is the result of someone smoking meth in the house.

The vendors claim the technique has no false positives, but even if we assume they are right about this they mean no false positives in the assay sense; that there definitely is methamphetamine in the sample.  The assay doesn’t guarantee that the tenant ‘allowed’ meth to be smoked in her house. And in this case it doesn’t even seem to guarantee that the contamination happened during her tenancy.

It’s not just this case and this assay, though those are bad enough. If predictive models are going to be used more widely in New Zealand social policy, it’s important that the evaluation of accuracy for those models is broader than just ‘assay error’, and considers the consequences in actual use.

May 4, 2016

Should you have bet on Leicester City?

As you know, Leicester City won the English Premier League this week. At the start of the season, you could get 5000:1 odds on this happening. Twelve people did.

Now, most weeks someone wins NZ Lotto first division, which pays more than 5000:1 for a winning ticket, and where we know the odds are actually unfavourable to the punter. The 5000:1 odds on their own aren’t enough to conclude the bookies had it wrong.  Lotto is different because we have good reasons to know that the probabilities are very small, based on how the numbers are drawn. With soccer, we’re relying on much weaker evidence.

Here’s Tim Gowers explaining why 5000:1 should have been obviously too extreme

The argument that we know how things work from following the game for years or even decades is convincing if all you want to prove is that it is very unlikely that a team like Leicester will win. But here we want to prove that the odds are not just low, but one-in-five-thousand low.

Professor Gowers does leave half the question unexamined, though

I’m ignoring here the well-known question of whether it is sensible to take unlikely bets just because your expected gain is positive. I’m just wondering whether the expected gain was positive.