Posts written by Thomas Lumley (1944)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

January 28, 2017

Charms to soothe the savage beast

Q: Did you see dogs prefer reggae and soft rock?

A: Not rap?

Q: Rap? You mean because of the human voices? Or because of Snoop Dogg?

A: Um. Yes. Voices. Definitely the voices thing. Wouldn’t dream of the horrible pun.

Q: Anyway, how did they find out what sort of music the dogs liked? Did they give them buttons to push, like those experiments with rats?

A: No.

Q: Did they see which speaker the dogs liked to sit near?

A: No.

Q: Can you work with me here?

A: They measured how relaxed the dogs were, by heart rate and whether they were lying down, and whether they were barking.

Q: The music they ‘liked most’ was really the music that made them lie down quietly and relax?

A: Yes.

Q: Have these people ever been teenagers?

A: To be fair, the research paper didn’t claim they were looking at preferences. That seems to be an invention of the press release.

Q: That would be the research paper that none of the stories linked, and most of them didn’t even hint at the existence of?

A: Yes, that one.

Q: So what were they really looking at?

A: The Scottish SPCA wants dogs to be quiet and relaxed (and presumptively  happy) in the kennels, while they’re waiting to find a new home.

Q: And soft rock and reggae were more relaxing than rap or thrash metal?

A: They didn’t look at all musical genres, just a few.  The dogs got a week of no music and a week with a different style each day (in random order, with music from Spotify).

Q: Soft rock and reggae were better than the other ones?

A: Well, Motown seemed to increase heart rate rather than decrease it, but the others were all pretty much the same.

Q: The others?

A:  Soft Rock, Reggae, Pop, Classical, Silence

Q: Wait, what? “Silence”?

A: Yes, a day of no music was about as good as a day of relaxing music.  It looks like variety might be the key. The researchers say

Interestingly, the physiological and behavioural changes observed in this study were maintained over the 5d of auditory stimulation, suggesting that providing a variety of different genres may help minimise habituation to auditory enrichment

Q: So what they really found is that playing dogs a variety of music relaxes them?

A: Yes, but that’s not such a good headline.

January 20, 2017


Herald story: School costs: $40,000 for ‘free’ state education

Last year’s Dom Post story: Families struggle to afford the rising cost of back-to-school requirements

Recycling last year’s StatsChat post:

So, it’s a non-random, unweighted survey, probably with a low response rate, among people signed up for an education-savings programme. You’d expect it to overestimate, but it’s not clear how much. Also

Figures have been rounded and represent the upper ranges that parents can reasonably expect to pay

It’s a real issue, but these particular numbers don’t deserve the publicity they get.

Measuring accuracy

Stuff  has a story “Scanner that can detect brain bleeds to be introduced in New Zealand.” The scanner uses infrared light to see relatively small accumulations of blood in the brain, with the aim of detecting bleeding quickly.  The traditional approach of looking at symptoms can often miss a bleed until after it’s done a lot of damage.

Accuracy is important for a device like this.  You don’t want to send lots of people off for CT scans, which are expensive and expose the patient to radiation, but you also don’t want to falsely reassure someone who really has a brain injury and who might then ignore later symptoms.

The story at Stuff claims 94% accuracy, but doesn’t say exactly what they mean by ‘accuracy’. Another story, at Gizmodo, says “A green light on the scanner gives the patient the all clear, and a red light shows a 90 per cent chance of haemorrhage.” The Gizmodo figures fit better with what’s on the manufacturer’s website, where they claim “Sensitivity = 88% / Specificity = 90.7%”.  That is, of people with (the right sort of) bleed, 88% will be detected, and of people without those bleeds, 90.7% will be cleared.

The Gizmodo story still confuses the forward and backwards probabilities. Out of 100 people with brain bleeds, 88 will get a red light on the machine. That’s not the same as their claim: that out of 100 people who get a red light on the machine, 90 have a bleed.

Suppose about 10% of the people it’s used on really do have brain bleeds. Out of an average 100 uses there would be 10 people with actual bleeds, about 9 of whom would get a red light. There would be 90 without actual bleeds, about 9 of whom would also get a red light. So the red light would only indicate about a 50% chance of a haemorrhage. That’s still pretty good, especially as it can be done rapidly and safely, but it’s not 90%.
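The same arithmetic can be written as a short calculation. This is only a sketch: the sensitivity and specificity are the manufacturer's quoted figures, the 10% prevalence is the assumption made above, and the function name is made up for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Chance of a real bleed given a red light (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Manufacturer's figures, with an assumed 10% of scanned people having a bleed.
ppv = positive_predictive_value(sensitivity=0.88, specificity=0.907, prevalence=0.10)
print(f"Chance of a bleed given a red light: {ppv:.0%}")

# The answer depends heavily on the assumed prevalence, which is a guess here.
for prevalence in (0.01, 0.05, 0.10, 0.25):
    value = positive_predictive_value(0.88, 0.907, prevalence)
    print(f"prevalence {prevalence:.0%}: red light means {value:.0%}")
```

With the 10% assumption the result is about 51%, in line with the rough count above. If the device were used on a lower-risk group the figure would be lower still.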

The other aspect of the story that’s not clear until you read the whole thing is what the news actually is. Based on the headline, you might think the point of the story is that someone’s started using this device in NZ, maybe in rugby or in ambulances, or is trialling it, or has at least ordered it.  But no.

No-one in New Zealand has yet got their hands on an infrascanner, but the hope is for it to be rolled out among major sporting bodies, public and private ambulance services, trauma centres and remote healthcare facilities.



January 18, 2017

Recognising te reo

Those of you on Twitter will have seen the little ‘translate this tweet’ suggestions that it puts up. If you’re from or in New Zealand you probably will have seen that reo Māori is often recognised by the algorithm as Latvian, presumably because Latvian also has long vowels indicated by macrons.   I’ve always been surprised by this, because Latvian looks so different.

It turns out I’m right.  Even looking just at individual letters, it’s very easy to distinguish the two.  I downloaded 74000 paragraphs of Latvian Wikipedia, a total of 6.5 million letters, and looked at how long the Latvians can go without using letters that don’t appear in te reo: specifically, s,z,j,v,d,c, g not as ng, the six accented consonants, and any consonant at the end of a word. On average, I only needed to wait five letters to know the language is Latvian rather than Māori, and 99% of the time it took less than 21 letters.
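For anyone who wants to try this, the check might look roughly like the sketch below. It is not the code used for the numbers above: the function name is hypothetical, the particular accented consonants are my assumption (the list isn't spelled out in the post), and end-of-text is treated as end-of-word.

```python
import unicodedata

# Letters listed above as absent from te reo. The accented consonants are an
# assumed Latvian set, not taken from the original analysis.
RULED_OUT = set("szjvdc") | set("čģķļņšž")
VOWELS = set("aeiouāēīōū")


def letters_until_ruled_out(text):
    """Count letters read before te reo can be ruled out; None if it never is."""
    text = unicodedata.normalize("NFC", text.lower())
    count = 0
    for i, ch in enumerate(text):
        if not ch.isalpha():
            continue
        count += 1
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        if ch in RULED_OUT:
            return count
        # 'g' only occurs in te reo as part of 'ng'.
        if ch == "g" and prev_ch != "n":
            return count
        # A consonant at the end of a word (end of text counts as end of word).
        if ch not in VOWELS and not next_ch.isalpha():
            return count
    return None


for sample in ("paldies par palīdzību", "kia ora koutou"):
    print(sample, "->", letters_until_ruled_out(sample))
```

Running this over many paragraphs and taking the mean and the 99th percentile of the counts would give summaries of the kind quoted above.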

Another language that Twitter often guesses is Finnish. That makes more sense: many of the letters not used in Māori are also rare or absent in Finnish, and ‘g’ appears mostly as ‘ng’.   However, Finnish does have ‘s’, has ‘ä’ and ‘ö’, and ‘y’, and has words ending in consonants, so it should also be feasible to distinguish.


Update: Indonesian is another popular guess, but it has ‘d’, ‘j’, ‘y’, ‘b’, and it has lots of words ending with consonants.  The average time to rule out te reo is slightly longer, at nearly 6 characters, and the 99th percentile is 22 letters.  So if the algorithm can’t tell, it should probably guess it’s not Indonesian.

Update: For very short tweets, and those in mixed languages, nothing’s going to work, but this is about tweets where the answer is obvious to a human.

January 17, 2017


  • There’s a planned course at the University of Washington “Calling Bullshit in the Age of Big Data”. Here’s the website with syllabus and readings, and the Twitter account.
  • Via a tweet from ‘Calling Bullshit’, there’s a computer science preprint looking at distinguishing ‘criminals’ from ‘normal people’ using photographs.  I usually wouldn’t comment here on research papers that haven’t made it to the news, but this sentence was irresistible
    “Unlike a human examiner/judge, a computer vision algorithm or classifier has absolutely no subjective baggages, having no emotions, no biases whatsoever due to past experience, race, religion, political doctrine, gender, age, etc.”
    An aim of both the course and this blog is to increase the number of people who find this sort of claim ridiculous.
  • For map nerds: a detailed cartographic comparison of Google Maps and Apple Maps.
  • Data journalism: the Guardian looks at the spatial concentration of gun violence in the US.
  • There’s a quote circulating widely now on social media: “Journalism is printing what someone else does not want printed. Everything else is public relations.” It’s being attributed to Orwell. He didn’t say it, which I think matters in this context.
    According to Quote Investigator, versions of it described as an ‘old saying’ were around in US journalism in the early 20th century. Later, in 1930, Walter Winchell attributed a version to William Randolph Hearst. More recently, it has been attributed to Lord Northcliffe, a UK pioneer of tabloid journalism. It wasn’t attributed to Orwell until the 1990s, decades after he died.

And finally: this is actually true

January 12, 2017

Measuring what you care about: turmeric edition


There’s a story on Stuff, with more detail at either Nature News or Scientific American, that turmeric doesn’t work. The original paper in the Journal of Medicinal Chemistry is open access. It’s not new chemical research; it’s a review of what’s known about curcumin, the allegedly-active ingredient of turmeric, and why the authors don’t believe in it.  In the opposite of the academic cliche, the point of the paper is to argue that less research is needed on curcumin and similar compounds.

StatsChat isn’t MedChemChat, but the paper is relevant for two reasons. First, turmeric is one of the foods that attracts low-quality, over-publicised research, which does end up on StatsChat. Second, the reason they don’t believe in turmeric is relevant.

Turmeric, if you believe the stories, appears to have pretty much every interesting biochemical effect anyone’s ever looked for.  That phenomenon has been seen before in medicinal chemistry, and the experience is that compounds which pass a huge range of screening tests tend to do it by cheating.

In 2010, two Australian chemists wrote a paper about “Pan-Assay INterference compounds” (PAINs) (abstract, story, blog post by another chemist). Most biologically interesting properties a compound might have aren’t visible to the naked eye. A lot of work goes into devising subtle and precise assays to measure them. A compound can mess up the assay and appear to pass the test without having the specific effect you’re looking for.  One important reason for PAIN is a compound that reacts with a wide range of proteins.

Turmeric, as you will no doubt have guessed, looks like a PAIN.  This nicely explains the gap between its excellent test-tube performance and its generally disappointing performance when given as food to whole animals or people.  The researchers are arguing that turmeric seems to work in the lab because it cheats, and that it seems safe but less useful than hoped in people and animals mostly because it’s not absorbed well.

As the stories are careful to note, none of this definitively implies that curcumin (or some other turmeric ingredient) couldn’t have a beneficial effect, just that most of the evidence isn’t credible.   The same argument applies to some other trendy antioxidants.

It’s a recurrent theme on StatsChat that most data aren’t the real thing you care about. The speedometer needle position isn’t the same as speed; saliva THC concentration isn’t the same as impairment; methamphetamine traces on a wall aren’t the same as use (or manufacture) by a tenant; having a Chinese name isn’t the same as being an overseas housing speculator.  The map isn’t the territory.



Photo by Flickr user saptarshikar

January 11, 2017

If you’re a house

From the Herald

Nationwide 63.2 per cent of people today live in their own home – the lowest rate since the 61.2 per cent recorded at the 1951 Census – whereas 33 per cent live in a rental.

From Newstalk ZB

A shade over 63 percent of people today are living in their own home. 

That’s the lowest rate since 1951 when it was 61 percent.

From Newshub

Dwelling and household estimates data released on Tuesday shows that as of December 2016, 63.2 percent of people live in their own home.

One News don’t have text up yet, but their story has the same claim.

As David Welch points out in a stat-of-the-week nomination, that’s not what the number means: 63.2% is the percentage of homes occupied by at least one of their owners.  It’s the home ownership rate if you’re a house, rather than if you’re a person.

The proportion of people living in those households isn’t easy to work out — on one hand, single-person households tend to be renters; on the other hand, overcrowded households are often renters too.  StatsNZ does provide the proportion of individuals who own their home, which is rather lower, at about 50%. But that’s not the number the news stories want, either.  That’s the proportion of people 20 and older who, personally, own or part-own their homes. Living in a home owned by your parents, or your partner, or your child, doesn’t count.
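A toy example shows how far apart the per-house and per-person rates can be. The households below are entirely made up for illustration (they are not StatsNZ data), and the function names are hypothetical.

```python
# Hypothetical households: (number of residents, owner-occupied?)
households = [
    (1, False),  # single person renting
    (2, True),
    (2, True),
    (4, True),
    (7, False),  # crowded rental
]


def household_rate(hh):
    """Share of homes occupied by an owner: the rate 'if you're a house'."""
    return sum(1 for _, owned in hh if owned) / len(hh)


def person_rate(hh):
    """Share of people living in an owner-occupied home."""
    people_in_owned = sum(n for n, owned in hh if owned)
    return people_in_owned / sum(n for n, _ in hh)


print(f"owner-occupied homes: {household_rate(households):.0%}")
print(f"people in owner-occupied homes: {person_rate(households):.0%}")
```

Here 60% of homes are owner-occupied but only 50% of people live in one. Swap the crowded rental for a second single renter and the per-person figure rises above the per-house figure, which is why you can't get one number from the other without the household sizes.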

That last sentence also illustrates why ‘home ownership’ is harder to define than you might think, just like unemployment.  Should a 22-year-old living with parents count towards home ownership? If not, should they count in the denominator as not home ownership, or should we just be looking at owning vs renting? How about an elderly person living with one of their children?

It would be helpful if the proportion of people living in owner-occupied households was published regularly, but it wouldn’t answer all the questions.  As an easier step, it would also be useful if the media accurately described the number they used.

Bogus poll stories, again

We have a headline today in the Herald: “New Zealand’s most monogamous town revealed”.

At first sight you might be worried this is something new that can be worked out from your phone’s sensor data, but no. It’s the result of a survey, and not even a survey of whether people are monogamous, but of whether they say they agree with the statement “I believe that monogamy is essential in a relationship” as part of the user data for a dating site that emphasises lasting relationships.

To make matters worse, this particular dating site’s marketing focuses on how different its members are from the general population.  It’s not going to be a good basis for generalising to “Kiwis are strongly in favour of monogamy”.

You can find the press release here (including the embedded map) and the dating site’s “in-depth article” here.

It’s not even that nothing else is happening in the world this week.

January 9, 2017

News to look forward to

Last year, we had a bunch of early-stage Alzheimer’s trials in the news. I thought I’d look at what’s due out in the clinical trial world this year.

Perhaps most importantly, in March we should see the first real results on a new set of cholesterol-lowering drugs.  The ‘PCSK9’ inhibitors are among the first drugs outside the cancer world to come from large-scale genetic studies without a particular hypothesis in mind. As the gene name ‘PCSK9’ indicates to those in the know, the gene was originally named just as the ninth in a series of genes that looked similar in structure.  It turned out that mutations in PCSK9 had big effects on LDL (‘bad’) cholesterol levels. Also, importantly, there is at least one person walking around alive and healthy with disabling mutations in both her copies of the gene, so there was a good chance that inhibiting the protein would be safe.  At least three companies have drugs (monoclonal antibodies) that target PCSK9 and reduce cholesterol by a lot, though the drugs need to be given by injection.

Although the drugs have been shown to reduce cholesterol, and have been approved for sale in the US for people with very high cholesterol not otherwise treatable, they haven’t been shown to prevent heart attacks (which is the point of lowering your cholesterol). The first trial looking at that sort of real outcome has finished, and there’s a good chance the results will be presented at the American College of Cardiology meeting in March.  For people in NZ the main interest isn’t in the new treatments — it’s hard to see them being cost-effective initially — but in the impact on understanding cholesterol.  If these drugs do prevent heart attacks, they will increase our confidence that LDL cholesterol really is a cause of disease; if they don’t, they will give aid and comfort to the people who think cholesterol is missing the whole point.

What else? There are some interesting migraine trials due out: both using a new approach to prevention and using a new approach to giving the current treatments.  The prevention approach is based on inhibiting something called CGRP in the brain, which appears to be a key trigger; the drug is injected, but only every few months.  The treatment approach is based on a new sort of skin patch to try to deliver the ‘triptan’ drugs, which they hope will be as fast as inhaling or injecting them and less unpleasant.

Also, there’s an earlier-stage New Zealand biotech product that will have results early in the year: using cells from specially bred pigs, coated so the immune system doesn’t notice them, to treat Parkinson’s Disease.


January 8, 2017

The drug-driving problem

The AA are campaigning again for random drug tests of drivers. I’m happy to stipulate that in NZ lots of people smoke cannabis, and some of these people drive when stoned, and sometimes when drunk as well, and this is bad. As the ads say.

On the other hand, science has not yet provided us with a good biochemical roadside test for impairment from cannabis. For alcohol, yes. For THC, no. That’s even more of an issue in the US states where recreational marijuana use is legal, since the option of just taking away driving licences for anyone with detectable levels isn’t even there.

This isn’t just a point about natural justice. There’s empirical reason (though not conclusive) to believe that many people who might fail a biochemical test are reasonably careful about driving while high.

First, there hasn’t been any evidence of an increase in road deaths in the US states where medical or recreational marijuana use is legal, even though there has been an increase in people driving with detectable levels of the drug.

Second, if you look at the 2010 ESR report (PDF) that the AA are relying on, you find (p20)

The culpability of the drivers using cannabis by itself was determined and odds ratios have been calculated as described in the alcohol section and in Appendix two. The results are given in Table seven. The odds ratio calculated for cannabis only use is only slightly greater than one, implying that cannabis does not significantly impact on the likelihood of having a crash.

Now, the report says, correctly, that this disagrees with other evidence and that we shouldn’t assume driving while stoned is safe. But they tried quite hard to do alternative analyses showing cannabis was bad, and were unsuccessful.

In 2012, there was another AA campaign, and a story in the Herald:

But Associate Minister of Transport Simon Bridges said the Government would wait for saliva testing technology to improve before using it.

A government review of the drug testing regime in May concluded the testing devices were not reliable or fast enough to be effective.

It ruled the saliva screening takes at least five minutes, is unlikely to detect half of cannabis users, and results are not reliable enough for criminal prosecution.

“The real factor is reliability … we can’t have innocent people accused of drug driving if they haven’t been.

“But as the technology improves, I’m sure in the future we will have a randomised roadside drug test.”

That seems like a sensible policy.