Posts written by Thomas Lumley (2053)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

September 24, 2017

The polls

So, how did the polls do this time? First, the main result was predicted correctly: either side needs a coalition with NZ First.

In more detail, here are the results from Peter Ellis’s forecasts from the page that lets you pick coalitions.

Each graph has three arrows. The red arrow shows the 2014 results. The blue/black arrow pointing down shows the current provisional count and the implied number of seats, and the horizontal arrow points to Graeme Edgeler’s estimate of what the special votes will do (not because he claims any higher knowledge, but because his estimates are on a web page and explain how he did it).

First, for National+ACT+UnitedFuture


Second, for Labour+Greens


The result is well within the uncertainty range of the predictions for Labour+Greens, and not bad for National. This isn’t just because NZ politics is easy to predict: the previous election’s results are much further away. In particular, Labour really did gain a lot more votes than could reasonably have been expected a few months ago.


Update: Yes, there’s a lot of uncertainty. And, yes, that does mean quoting opinion poll results to the nearest 0.1% is silly.

September 20, 2017

Democracy is coming

Unless someone says something really annoyingly wrong about polling in the next few days, I’m going to stop commenting until Saturday night.

Some final thoughts:

  • The election looks closer than NZ opinion polling is able to discriminate. Anyone who thinks they know what the result will be is wrong.
  • The most reliable prediction based on polling data is that the next government will at least need confidence and supply from NZ First. Even that isn’t certain.
  • It’s only because of opinion polling that we know the election is close. It would be really surprising if Labour didn’t do a lot better than the 25% they managed in the 2014 election — but we wouldn’t know that without the opinion polls.



Takes two to tango

Right from the start of StatsChat we’ve looked at stories about how men or women have more sexual partners. There’s another one in the Herald as a Stat of the Week nomination.

To start off, there’s the basic adding-up constraint: among exclusively heterosexual people, or restricted to opposite-sex partners, the two averages are necessarily identical over the whole population.
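The adding-up constraint is easy to check by brute force. A minimal sketch with made-up partnership data (equal numbers of men and women, arbitrary opposite-sex pairings):

```python
# Brute-force check of the adding-up constraint: every opposite-sex
# partnership adds one partner to one man and one partner to one woman,
# so the totals (and, with equal group sizes, the averages) must match.
import random

random.seed(1)
n = 1000  # equal numbers of men and women
partnerships = set()
while len(partnerships) < 3000:
    partnerships.add((random.randrange(n), random.randrange(n)))  # (man, woman)

men_partners = [0] * n
women_partners = [0] * n
for m, w in partnerships:
    men_partners[m] += 1
    women_partners[w] += 1

print(sum(men_partners) / n, sum(women_partners) / n)  # 3.0 3.0
```

Whatever pattern of partnerships you generate, the two averages come out identical; only unequal group sizes (or same-sex partners, or lying) can make them differ.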

This survey (the original version of the story is here) doesn’t say that it just asked about opposite-sex partners, so the difference could be true.  On average, gay men have more sexual partners and lesbians have fewer sexual partners, so you’d expect a slightly higher average for all men than for all women.  Using binary classifications for trans and non-binary people will also stop the numbers matching exactly.

But there are bigger problems. First, 30% of women and 40% of men admit this is something they lie about. And while the rest claim they’ve never lied about it, well, they would, wouldn’t they?

And the survey doesn’t look all that representative.  The “Methodology” heading is almost entirely unhelpful — it’s supposed to say how you found the people, not just

We surveyed 2,180 respondents on questions relating to sexual history. 1,263 respondents identified as male with 917 respondents identifying as female. Of these respondents, 1,058 were from the United States and another 1,122 were located within Europe. Countries represented by fewer than 10 respondents and states represented by fewer than five respondents were omitted from results.

However, the sample is clearly not representative by gender or location, and the fact that they dropped some states and countries afterwards suggests they weren’t doing anything to get a representative sample.

The Herald has a bogus clicky poll on the subject. Here’s what it looks like on my desktop


On my phone a couple more options are visible, but not all of them. It’s probably less reliable than the survey in the story, but not by a whole lot.

This sort of story can be useful in making people more willing to talk about their sexual histories, but the actual numbers don’t mean a lot.

September 19, 2017


  • During the Cold War, there were a few occasions where a nuclear war could easily have started if one person hadn’t got in the way. One of those people was Stanislav Petrov. He died this week.
  • I saw a pharmacy in Ponsonby advertising “Ultrasound bone density screening for all ages”. There’s no way screening for osteoporosis makes sense ‘for all ages’, even if it was free (which it isn’t).
  • As I’ve mentioned a few times, the UK has an independent Statistics Authority whose chair is supposed to monitor and rebuke misuses of official statistics. The chair, Sir David Norgrove, criticised Boris Johnson over the £350m “savings” from Brexit he has kept repeating. We don’t have anything similar, sadly.
  • If you’re interested in the history of data journalism, you could do worse than reading Alberto Cairo’s PhD thesis. Dr Cairo is a former data journalist, current professor of visual journalism at the University of Miami, and one of next year’s Ihaka Lecture speakers here in Auckland.
  • Janelle Shane has a blog with examples of neural networks generalising from a wide range of inputs (recipes, hamster names, craft beers). Her current post is on D&D spell names, and shows the importance of a large input set for these networks: would you prefer your character to cast “Plonting Cloud” or “Wall of Storm”?
  • Kieran Healy, of Duke University, has an online book Data Visualization for Social Science. Yes, if you think you recognise the name, it’s him.
  • The American Statistical Association and the New York Times are partnering in a new monthly feature, “What’s Going On in This Graph?”

Denominators and BIGNUMs


It’s pretty obvious that Bon Appétit has just confused averages and totals here.

So, what is the average? There were about 75 million millennials in the US in 2016 (we can probably assume  Bon Appétit doesn’t care about other countries), so we’re looking at $1280/year, or about $25/week. Which actually seems pretty low as an average.  The US as a whole spent $1.46 trillion on food and beverages in 2014, which is about $4500/person/year or about $87/week.
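For anyone who wants to check the division, here it is. The US population figure below is my assumption (roughly right for the mid-2010s), not a number from the story:

```python
# Per-person spending implied by the numbers in the post.
millennials = 75_000_000
per_millennial_year = 1280
print(per_millennial_year / 52)  # about 24.6, i.e. the "about $25/week"

us_food_total = 1.46e12          # US food and beverage spending, 2014
us_population = 320_000_000      # assumed, roughly the 2014 population
per_person_year = us_food_total / us_population
# In the same ballpark as the post's "about $4500/person/year" and "$87/week".
print(round(per_person_year), round(per_person_year / 52))
```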

As with so much generation-mongering, asking about the facts is missing the intended purpose of the story, which is to recycle some stereotypes about lazy/wasteful youth.

The story links to another, about a new book “Generation Yum”

Turow characterized the quintessential Millennial experience this way: “You got into a top tier high school, you hustled through college—you’ve done everything society told you—and you’re not rewarded.”

When “get into a top-tier high school” is a quintessential generational experience it’s clear we’re not even trying to go beyond unrepresentative stereotypes.  In which case, hold the numbers.

September 18, 2017

Another Alzheimer’s test

There’s a new Herald story with the lead

Artificial intelligence (AI) can identify Alzheimer’s disease 10 years before doctors can discover the symptoms, according to new research.

The story doesn’t link (even to the Daily Mail). Before we get to that, regular StatsChat readers will have some idea of what to expect.

Early diagnosis for Alzheimer’s is potentially useful when designing clinical trials for new treatments, and eventually will be useful for early treatment (when we get treatments that work).  But not yet.  It’s also not as much of a novelty as the story suggests. Candidate tests for early diagnosis are appearing all over the place (here’s seven of them).

Second, you’d expect the accuracy of the test and its degree of foresight to have been exaggerated — and the story confirms this.

Following the training, the AI was then asked to process brains from 148 subjects – 52 were healthy, 48 had Alzheimer’s disease and 48 had mild cognitive impairment (MCI) but were known to have developed Alzheimer’s disease two and a half to nine years later.

That is, the early diagnosis wasn’t of people without symptoms: it was of people whose symptoms had led to a diagnosis, but didn’t yet amount to dementia.

The Herald doesn’t link, but Google finds a story at New Scientist, and they do link. The link is to the arXiv preprint server. That’s unusual: normally this sort of story is either complete vapour or is based on an article in a research journal.  This one is neither: it’s a real scientific report, but one that hasn’t yet been published — it’s probably undergoing peer review at the moment.

Anyway, the preprint is enough to look up the accuracy of the test. The sensitivity was high: nearly all Alzheimer’s cases and cases of Mild Cognitive Impairment were picked up. The specificity was terrible: more than 1/4 of people tested would receive a false positive diagnosis.
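To see what terrible specificity does to a screening test, here’s a rough positive-predictive-value sketch. The operating characteristics below are illustrative stand-ins consistent with the description above (high sensitivity, more than 1/4 false positives), not the preprint’s exact figures, and the prevalence is also an assumption:

```python
# Positive predictive value: of the people the test flags, how many
# actually have the disease? All three inputs are illustrative assumptions.
sensitivity = 0.95   # fraction of true cases the test flags
specificity = 0.72   # fraction of non-cases correctly cleared
prevalence = 0.10    # assumed fraction of cases in the screened population

true_pos = sensitivity * prevalence
false_pos = (1 - specificity) * (1 - prevalence)
ppv = true_pos / (true_pos + false_pos)
print(round(ppv, 2))  # 0.27: at this prevalence, most positives are wrong
```

At anything like screening-population prevalences, a test with specificity this poor generates mostly false alarms, which is why it isn’t close to a clinical tool yet.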

It’s possible that this test can be re-tuned into a genuinely useful clinical tool. As published, though, it isn’t even close.

But probably not

Q: Did you see icecream for breakfast may improve mental performance?


A: Pigs may fly.

Q: But it’s a STUDY

A: That’s actually one of the questions left unresolved.

Q: Just follow the link. The International Business Times links to their source.

A: That link is to a Japanese news site. And it’s 404.

Q: Already? The tweet was just from this weekend.

A: The story is from November last year.

Q: But there’s a professor! Isn’t he real? Can’t you look at his publications?

A: Yes, he’s real. And he has publications. And they aren’t about icecream for breakfast.

Q: Back to the icecream. It could still be true, even if the data aren’t published, right?

A: Sure. In fact there’s a fair chance that, compared to no breakfast, icecream could improve mental performance.

Q: The comparison was to not eating anything?

A: It was compared to a glass of cold water.

Q: So, what does this tell us?

A: 2017 must be a slow news year.

September 17, 2017

Polls that aren’t any use


From last week in the Herald: 73.6 per cent of landlords plan rent rises if Labour wins. It’s been a while since I noticed a bogus-poll headline, but they keep coming back.

This time there are two independent reasons this number is meaningless.  First, it’s a self-selected survey — a bogus poll.  You can think of self-selected surveys as a type of petition: they don’t tell you anything useful about the people who didn’t respond, so the results are only interesting if the absolute number responding in a particular category is surprisingly high.  In this case, it’s 73.6% of 816 landlords. According to an OIA request in 2015,  there are more than 120,000 landlords in NZ, so we’re looking at a ‘yes’ response from less than half a percent of them.
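The petition-style arithmetic, for the record (816 and 73.6% are from the story, 120,000 from the OIA request):

```python
# The "yes" count behind the headline, as a share of all NZ landlords.
respondents = 816
yes = round(0.736 * respondents)   # about 600 landlords saying "yes"
landlords = 120_000
print(yes, round(100 * yes / landlords, 2))  # 601 0.5, under half a percent
```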

Second, there’s an important distinction in polling questions to worry about.  If a nice pollster calls you up one evening and asks who you’re voting for, there’s no particular reason to say anything other than the truth.  The truth is the strongest possible signal of your political affiliation.  If a survey asks “will you raise rents if Labour gets in and raises costs?”,  it’s fairly natural to say “yes” as a sign that you don’t support Labour, whether it’s true or not. There’s no cost to saying “yes”, but if you’re currently setting rents at what you think is the right level, there is a cost to raising them.

Those of you who do arithmetic compulsively will have noticed another, more minor, problem with the headline. There is no number of votes out of 816 that rounds correctly to 73.6%: 600/816 is 73.52941%, ie, 73.5%, and 601/816 is 73.65196%, ie, 73.7%. And, of course, headlining the results of any poll, even a good one, to the nearest tenth of a percentage point is silly.
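For the compulsive, the claim is easy to verify exhaustively:

```python
# No whole-number count out of 816 rounds to 73.6% at one decimal place.
n = 816
matches = [k for k in range(n + 1) if round(100 * k / n, 1) == 73.6]
print(matches)                  # []
print(round(100 * 600 / n, 1))  # 73.5
print(round(100 * 601 / n, 1))  # 73.7
```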

September 13, 2017

Thresholds and discards, again

There are competing explanations out there about what happens to votes for a party that doesn’t reach the 5%/1 electorate threshold.  This post is about why I don’t like one of them.

People will say (such as on NZ Morning Report this morning) that your votes are reallocated to other parties.  In some voting systems, such as the STV we use for local government elections, reallocating votes is a thing. Your voting paper literally (or virtually) starts off in one party’s pile and is moved to a different party’s pile.

That’s not what happens with the party votes for Parliament.  If the Greens don’t make 5%, party votes for the Greens are not used in allocating List seats.  It’s exactly as if those voters hadn’t cast a party vote, which I think is a simple enough explanation to use.

Now, in the vast majority of cases the result will be the same as if the votes had been reallocated in proportion — unless something weird like a tie happens at some stage in the counting — but one of the explanations is what happens and the other one isn’t.

If you think the two explanations convey the same meaning, you shouldn’t object to using the one that’s actually correct. And if you think they convey different meanings, you definitely shouldn’t object to using the one that’s actually correct.
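For anyone who wants to see the “same result” claim concretely: NZ allocates list seats by the Sainte-Laguë highest-averages method, and reallocating the discarded votes in proportion multiplies every remaining party’s total (and hence every quotient) by the same factor, so the seat-by-seat winners can’t change. A sketch with made-up vote counts:

```python
# Sainte-Laguë: repeatedly give the next seat to the party with the
# highest quotient votes / (2*seats_won + 1).
def sainte_lague(votes, seats):
    alloc = {p: 0 for p in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: votes[p] / (2 * alloc[p] + 1))
        alloc[winner] += 1
    return alloc

votes = {"A": 461_000, "B": 371_000, "C": 128_000}  # hypothetical parties
discarded = 40_000  # party votes for a party under the threshold

dropped = sainte_lague(votes, 120)

# Proportional reallocation scales every total by the same factor,
# so every quotient comparison comes out exactly the same way.
factor = 1 + discarded / sum(votes.values())
reallocated = sainte_lague({p: v * factor for p, v in votes.items()}, 120)

print(dropped == reallocated)  # True (barring exact ties)
```

The exception in the post — an exact tie in the counting — is the only place the two descriptions could diverge, which is why “not used” rather than “reallocated” is the correct account of what the Electoral Act actually does.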


September 10, 2017

Should there be an app for that?

As you may have heard, researchers at Stanford have tried to train a neural network to predict sexual orientation from photos. Here’s the Guardian‘s story.

Artificial intelligence can accurately guess whether people are gay or straight based on photos of their faces, according to new research that suggests machines can have significantly better “gaydar” than humans.

There are a few questions this should raise.  Is it really better? Compared to whose gaydar? And WTF would think this was a good idea?

As one comment on the study says

Finally, the predictability of sexual orientation could have serious and even life-threatening implications to gay men and women and the society as a whole. In some cultures, gay men and women still suffer physical and psychological abuse at the hands of governments, neighbors, and even their own families.

No, I lied. That’s actually a quote from the research paper (here). The researchers say this sort of research is ethical and important because people don’t worry enough about their privacy. Which is a point of view.

So, you might wonder about the details.

The data came from a dating website, using self-identified gender for the photo combined with the gender they were interested in dating to work out sexual orientation. That’s going to be pretty accurate (at least if you don’t care how bisexual people are classified, which they don’t seem to). It’s also pretty obvious that the pictures weren’t put up for the purpose of AI research.

The Guardian story says

 a computer algorithm could correctly distinguish between gay and straight men 81% of the time, and 74% for women

which is true, but is a fairly misleading summary of accuracy. Presented with a pair of faces, one of which was gay and one wasn’t, that’s how accurate the computer was. In terms of overall error rate, you can do better than 81% or 74% just by assuming everyone is straight, and the increase in prediction accuracy in random people over the human judgment is pretty small.
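To put the base-rate point in numbers (the base rate below is an illustrative assumption, not a figure from the paper):

```python
# With a low base rate, the trivial "everyone is straight" classifier
# beats 81% on overall accuracy, which is why pairwise accuracy makes
# a misleading headline number.
base_rate = 0.07          # assumed population fraction of gay men
pairwise_accuracy = 0.81  # reported accuracy on matched gay/straight pairs

trivial_accuracy = 1 - base_rate  # always guess "straight"
print(round(trivial_accuracy, 2), trivial_accuracy > pairwise_accuracy)  # 0.93 True
```

Pairwise accuracy on a 50:50 sample and overall accuracy in the population are different quantities, and quoting one as if it were the other flatters the classifier.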

More importantly, these are photos from dating profiles. You’d expect dating profile photos to give more hints about sexual orientation than, say, passport photos, or CCTV stills.  That’s what they’re for.  The researchers tried to get around this, but they were limited by the mysterious absence of large databases of non-dating photos classified by sexual orientation.

The other question you might have is about the less-accurate human ratings.  These were done using Amazon’s Mechanical Turk.  So, a typical Mechanical Turk worker, presented only with a single pair of still photos, does do a bit worse than a neural network.  That’s basically what you’d expect with the current levels of still image classification: algorithms can do better than people who aren’t particularly good and who don’t get any particular training.  But anyone who thinks that’s evidence of significantly better gaydar than humans in a meaningful sense must have pretty limited experience of social interaction cues. Or have some reason to want the accuracy of their predictions overstated.

The research paper concludes

The postprivacy world will be a much safer and hospitable place if inhabited by well-educated, tolerant people who are dedicated to equal rights.

That’s hard to argue with. It’s less clear that normalising the automated invasion of privacy and use of personal information without consent is the best way to achieve this goal.