January 28, 2017

Charms to soothe the savage beast

Q: Did you see dogs prefer reggae and soft rock?

A: Not rap?

Q: Rap? You mean because of the human voices? Or because of Snoop Dogg?

A: Um. Yes. Voices. Definitely the voices thing. Wouldn’t dream of the horrible pun.

Q: Anyway, how did they find out what sort of music the dogs liked? Did they give them buttons to push, like those experiments with rats?

A: No.

Q: Did they see which speaker the dogs liked to sit near?

A: No.

Q: Can you work with me here?

A: They measured how relaxed the dogs were, by heart rate and whether they were lying down, and whether they were barking.

Q: The music they ‘liked most’ was really the music that made them lie down quietly and relax?

A: Yes.

Q: Have these people ever been teenagers?

A: To be fair, the research paper didn’t claim they were looking at preferences. That seems to be an invention of the press release.

Q: That would be the research paper that none of the stories linked, and most of them didn’t even hint at the existence of?

A: Yes, that one.

Q: So what were they really looking at?

A: The Scottish SPCA wants dogs to be quiet and relaxed (and presumably happy) in the kennels, while they’re waiting to find a new home.

Q: And soft rock and reggae were more relaxing than rap or thrash metal?

A: They didn’t look at all musical genres, just a few.  The dogs got a week of no music and a week with a different style each day (in random order, with music from Spotify).

Q: Soft rock and reggae were better than the other ones?

A: Well, Motown seemed to increase heart rate rather than decrease it, but the others were all pretty much the same.

Q: The others?

A: Soft rock, reggae, pop, classical, and silence.

Q: Wait, what? “Silence”?

A: Yes, a day of no music was about as good as a day of relaxing music.  It looks like variety might be the key. The researchers say

Interestingly, the physiological and behavioural changes observed in this study were maintained over the 5d of auditory stimulation, suggesting that providing a variety of different genres may help minimise habituation to auditory enrichment

Q: So what they really found is that playing dogs a variety of music relaxes them?

A: Yes, but that’s not such a good headline.

January 23, 2017

Stat of the Week Competition: January 21 – 27 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday January 27 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of January 21 – 27 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

(more…)

January 20, 2017

Recycling

Herald story: “School costs: $40,000 for ‘free’ state education”

Last year’s Dom Post story: “Families struggle to afford the rising cost of back-to-school requirements”

Recycling last year’s StatsChat post:

So, it’s a non-random, unweighted survey, probably with a low response rate, among people signed up for an education-savings programme. You’d expect it to overestimate, but it’s not clear how much. Also

Figures have been rounded and represent the upper ranges that parents can reasonably expect to pay

It’s a real issue, but these particular numbers don’t deserve the publicity they get.

Measuring accuracy

Stuff has a story, “Scanner that can detect brain bleeds to be introduced in New Zealand.” The scanner uses infrared light to see relatively small accumulations of blood in the brain, with the aim of detecting bleeding quickly. The traditional approach of looking at symptoms can often miss a bleed until after it’s done a lot of damage.

Accuracy is important for a device like this.  You don’t want to send lots of people off for CT scans, which are expensive and expose the patient to radiation, but you also don’t want to falsely reassure someone who really has a brain injury and who might then ignore later symptoms.

The story at Stuff claims 94% accuracy, but doesn’t say exactly what they mean by ‘accuracy’. Another story, at Gizmodo, says “A green light on the scanner gives the patient the all clear, and a red light shows a 90 per cent chance of haemorrhage.” The Gizmodo figures fit better with what’s on the manufacturer’s website, where they claim “Sensitivity = 88% / Specificity = 90.7%”.  That is, of people with (the right sort of) bleed, 88% will be detected, and of people without those bleeds, 90.7% will be cleared.

The Gizmodo story still confuses the forward and backwards probabilities. Out of 100 people with brain bleeds, 88 will get a red light on the machine. That’s not the same as their claim: that out of 100 people who get a red light on the machine, 90 have a bleed.

Suppose about 10% of the people it’s used on really do have brain bleeds. Out of an average 100 uses there would be 10 actual bleeds, 9 of whom would get a red light. There would be 90 without actual bleeds, about 9 of whom would get a red light. So the red light would only indicate about a 50% chance of a haemorrhage. That’s still pretty good, especially as it can be done rapidly and safely, but it’s not 90%.
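The arithmetic above can be checked with a short sketch. The 10% prevalence is my illustrative assumption, not a figure from the manufacturer:

```python
# Positive predictive value from sensitivity, specificity, and an
# assumed prevalence. Sensitivity and specificity are the
# manufacturer's figures; the prevalence is invented for illustration.
sensitivity = 0.88   # P(red light | bleed)
specificity = 0.907  # P(green light | no bleed)
prevalence = 0.10    # assumed share of scanned patients with a bleed

true_positives = prevalence * sensitivity               # 0.088
false_positives = (1 - prevalence) * (1 - specificity)  # ~0.084

ppv = true_positives / (true_positives + false_positives)
print(f"P(bleed | red light) = {ppv:.0%}")  # about 51%, not 90%
```

Changing the assumed prevalence changes the answer a lot, which is exactly why quoting sensitivity as if it were the backwards probability is misleading.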

The other aspect of the story that’s not clear until you read the whole thing is what the news actually is. Based on the headline, you might think the point of the story is that someone’s started using this device in NZ, maybe in rugby or in ambulances, or is trialling it, or has at least ordered it.  But no.

No-one in New Zealand has yet got their hands on an infrascanner, but the hope is for it to be rolled out among major sporting bodies, public and private ambulance services, trauma centres and remote healthcare facilities.


January 18, 2017

Recognising te reo

Those of you on Twitter will have seen the little ‘translate this tweet’ suggestions that it puts up. If you’re from or in New Zealand you probably will have seen that reo Māori is often recognised by the algorithm as Latvian, presumably because Latvian also has long vowels indicated by macrons. I’ve always been surprised by this, because Latvian looks so different.

It turns out I’m right.  Even looking just at individual letters, it’s very easy to distinguish the two.  I downloaded 74000 paragraphs of Latvian Wikipedia, a total of 6.5 million letters, and looked at how long the Latvians can go without using letters that don’t appear in te reo: specifically, s,z,j,v,d,c, g not as ng, the six accented consonants, and any consonant at the end of a word. On average, I only needed to wait five letters to know the language is Latvian rather than Māori, and 99% of the time it took less than 21 letters.
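As a rough reconstruction (my sketch, not the code actually used for the analysis), the letter-by-letter check might look like this — scan the text for the first letter that can’t occur in te reo Māori:

```python
# Rough sketch: return how many letters we read before one that
# rules out te reo Māori (None if the text never rules it out).
MAORI_VOWELS = set("aeiouāēīōū")
MAORI_CONSONANTS = set("hkmnprtw")  # plus the digraphs 'ng' and 'wh'

def letters_until_not_maori(text):
    """Count letters up to and including the first one impossible
    in te reo Māori, or return None if none is found."""
    text = text.lower()
    count = 0
    for i, ch in enumerate(text):
        if not ch.isalpha():
            continue
        count += 1
        if ch in MAORI_VOWELS:
            continue
        if ch == "g":
            # 'g' is only legal as the second half of the digraph 'ng'
            if i > 0 and text[i - 1] == "n":
                continue
            return count
        if ch in MAORI_CONSONANTS:
            # Māori words never end in a consonant
            next_ch = text[i + 1] if i + 1 < len(text) else " "
            if next_ch.isalpha():
                continue
            return count
        # s, z, j, v, d, c, Latvian's accented consonants, and so on
        return count
    return None
```

On Latvian text this typically fires within the first word or two, consistent with the five-letter average reported above.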

Another language that Twitter often guesses is Finnish. That makes more sense: many of the letters not used in Māori are also rare or absent in Finnish, and ‘g’ appears mostly as ‘ng’.   However, Finnish does have ‘s’, has ‘ä’ and ‘ö’, and ‘y’, and has words ending in consonants, so it should also be feasible to distinguish.

 

Update: Indonesian is another popular guess, but it has ‘d’, ‘j’, ‘y’, ‘b’, and it has lots of words ending with consonants. The average time to rule out te reo is slightly longer, at nearly 6 characters, and the 99th percentile is 22 letters. So if the algorithm can’t tell, it should probably guess it’s not Indonesian.

Update: For very short tweets, and those in mixed languages, nothing’s going to work, but this is about tweets where the answer is obvious to a human.

January 17, 2017

Briefly

  • There’s a planned course at the University of Washington “Calling Bullshit in the Age of Big Data”. Here’s the website with syllabus and readings, and the Twitter account.
  • Via a tweet from ‘Calling Bullshit’, there’s a computer science preprint looking at distinguishing ‘criminals’ from ‘normal people’ using photographs.  I usually wouldn’t comment here on research papers that haven’t made it to the news, but this sentence was irresistible
    “Unlike a human examiner/judge, a computer vision algorithm or classifier has absolutely no subjective baggages, having no emotions, no biases whatsoever due to past experience, race, religion, political doctrine, gender, age, etc.”
    An aim of both the course and this blog is to increase the number of people who find this sort of claim ridiculous.
  • For map nerds: a detailed cartographic comparison of Google Maps and Apple Maps.
  • Data journalism: the Guardian looks at the spatial concentration of gun violence in the US.
  • There’s a quote circulating widely now on social media: “Journalism is printing what someone else does not want printed. Everything else is public relations.” It’s being attributed to Orwell. He didn’t say it — which I think matters in this context.
    According to Quote Investigator, versions of it described as an ‘old saying’ were around in US journalism in the early 20th century. Later, in 1930, Walter Winchell attributed a version to William Randolph Hearst. More recently, it has been attributed to Lord Northcliffe, a UK pioneer of tabloid journalism. It wasn’t attributed to Orwell until the 1990s, decades after he died.

And finally: this is actually true

January 16, 2017

Stat of the Week Competition: January 14 – 20 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday January 20 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of January 14 – 20 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

(more…)

January 12, 2017

Measuring what you care about: turmeric edition


There’s a story on Stuff, with more detail at either Nature News or Scientific American, that turmeric doesn’t work. The original paper in the Journal of Medicinal Chemistry isn’t open access (actually, it is), but its abstract is. It’s not new chemical research; it’s a review of what’s known about curcumin, the allegedly-active ingredient of turmeric, and why they don’t believe it. In the opposite of the academic cliché, the point of the paper is to argue that less research is needed on curcumin and similar compounds.

StatsChat isn’t MedChemChat, but the paper is relevant for two reasons. First, turmeric is one of the foods that attracts low-quality, over-publicised research, which does end up on StatsChat. Second, the reason they don’t believe in turmeric is relevant.

Turmeric, if you believe the stories, appears to have pretty much every interesting biochemical effect anyone’s ever looked for.  That phenomenon has been seen before in medicinal chemistry, and the experience is that compounds which pass a huge range of screening tests tend to do it by cheating.

In 2010, two Australian chemists wrote a paper about “Pan-Assay INterference compoundS” (PAINS) (abstract, story, blog post by another chemist). Most biologically interesting properties a compound might have aren’t visible to the naked eye, so a lot of work goes into devising subtle and precise assays to measure them. A compound can mess up the assay itself and appear to pass the test without having the specific effect you’re looking for. One common source of pan-assay interference is a compound that reacts with a wide range of proteins.

Turmeric, as you will no doubt have guessed, looks like a PAIN. This nicely reconciles its excellent test-tube performance with its generally disappointing performance when given as food to whole animals or people. The researchers are arguing that turmeric seems to work in the lab because it cheats, and that it seems safe but less useful than hoped in people and animals mostly because it’s not absorbed well.

As the stories are careful to note, none of this proves that curcumin (or some other turmeric ingredient) couldn’t have a beneficial effect, just that most of the existing evidence isn’t credible. The same argument applies to some other trendy antioxidants.

It’s a recurrent theme on StatsChat that most data aren’t the real thing you care about. The speedometer needle position isn’t the same as speed; saliva THC concentration isn’t the same as impairment; methamphetamine traces on a wall aren’t the same as use (or manufacture) by a tenant; having a Chinese name isn’t the same as being an overseas housing speculator. The map isn’t the territory.


Photo by Flickr user saptarshikar

January 11, 2017

If you’re a house

From the Herald

Nationwide 63.2 per cent of people today live in their own home – the lowest rate since the 61.2 per cent recorded at the 1951 Census – whereas 33 per cent live in a rental.

From Newstalk ZB

A shade over 63 percent of people today are living in their own home. 

That’s the lowest rate since 1951 when it was 61 percent.

From Newshub

Dwelling and household estimates data released on Tuesday shows that as of December 2016, 63.2 percent of people live in their own home.

One News don’t have text up yet, but their story has the same claim.

As David Welch points out in a stat-of-the-week nomination, that’s not what the number means: 63.2% is the percentage of homes occupied by at least one of their owners.  It’s the home ownership rate if you’re a house, rather than if you’re a person.

The proportion of people living in those households isn’t easy to work out — on one hand, single-person households tend to be renters; on the other hand, overcrowded households are often renters too.  StatsNZ does provide the proportion of individuals who own their home, which is rather lower, at about 50%. But that’s not the number the news stories want, either.  That’s the proportion of people 20 and older who, personally, own or part-own their homes. Living in a home owned by your parents, or your partner, or your child, doesn’t count.

That last sentence also illustrates why ‘home ownership’ is harder to define than you might think, just like unemployment.  Should a 22-year-old living with parents count towards home ownership? If not, should they count in the denominator as not home ownership, or should we just be looking at owning vs renting? How about an elderly person living with one of their children?

It would be helpful if the proportion of people living in owner-occupied households was published regularly, but it wouldn’t answer all the questions.  As an easier step, it would also be useful if the media accurately described the number they used.
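A toy example (invented numbers, not StatsNZ data) shows why the dwelling-level rate and the person-level rate needn’t agree:

```python
# Invented households: (number of residents, owner-occupied?).
# The dwelling-level ownership rate weights each household equally;
# the person-level rate weights by how many people live there.
households = [
    (1, False),  # single-person rental
    (1, False),  # single-person rental
    (2, True),   # owner-occupied couple
    (4, True),   # owner-occupied family
    (6, False),  # overcrowded rental
]

dwelling_rate = sum(own for _, own in households) / len(households)
people_total = sum(n for n, _ in households)
people_in_owned = sum(n for n, own in households if own)
person_rate = people_in_owned / people_total

print(f"{dwelling_rate:.0%} of dwellings are owner-occupied")   # 40%
print(f"{person_rate:.0%} of people live in owner-occupied homes")  # 43%
```

Whether the person-level rate ends up higher or lower than the dwelling-level rate depends on exactly the tension described above: small households that rent pull it up, large renting households pull it down.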

Bogus poll stories, again

We have a headline today in the Herald: “New Zealand’s most monogamous town revealed”.

At first sight you might be worried this is something new that can be worked out from your phone’s sensor data, but no. It’s the result of a survey, and not even a survey of whether people are monogamous, but of whether they say they agree with the statement “I believe that monogamy is essential in a relationship” as part of the user data for a dating site that emphasises lasting relationships.

To make matters worse, this particular dating site’s marketing focuses on how different its members are from the general population. It’s not going to be a good basis for generalising to “Kiwis are strongly in favour of monogamy”.

You can find the press release here (including the embedded map) and the dating site’s “in-depth article” here.

It’s not even that nothing else is happening in the world this week.