Posts written by Thomas Lumley (2031)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient

August 1, 2017

Holiday travel trends

The Herald has a story and video graphic, and a nice interactive graphic on international travel by Kiwis since 1979.  The story is basically good (and even quotes a price corrected for inflation).

Here’s one frame of the video graphic

First, a lot of the world isn’t coloured. There are New Zealanders who have visited, say, Germany or Turkey or Egypt, even though these countries never make it into the 1-24,999 colour category. It looks as if the video picks a set of 16 countries and follows just those forward in time: we’re not told how these were picked.

Second, there’s the usual map problem of big things looking big (exacerbated by the Mercator projection). In 1999, more people went to Fiji than the US; more to Samoa than France. A map isn’t good at making these differences visually obvious, though the animation helps. And, tangentially, if you’re going to use almost a third of the map real estate on the region north of 60°, you should notice that Alaska is part of the USA.

The other, more important, issue that’s common to the whole presentation (and which I understand is being updated at the moment) is what the country data actually mean. It seems that it really is holiday data, excluding both business and visiting friends/relatives (comparing the video to this from Figure.NZ), but it’s by “country of main destination”.  If you go to more than one country, only one is counted.  That’s why the interactive shows zero Kiwis travelling to the Vatican City, and it may help explain numbers like 300 for Belgium.

Official statistics usually measure something fairly precise, but it’s not always the thing that you want them to measure.

July 30, 2017

Coffee news?

In 2015, the Herald said

Drinking the caffeine equivalent of more than four espressos a day is harmful to health, especially for minors and pregnant women, the European Union food safety agency has said.

“It is the first time that the risks from caffeine from all dietary sources have been assessed at EU level,” the EFSA said, recommending that an adult’s daily caffeine intake remain below 400mg a day.

(I quoted it at the time: the link seems to be dead now).

Now we have, under the headline Good news for coffee lovers: Caffeine is harmless, says research

A review of 44 trials dispelled the widespread myth that caffeine, found in tea, coffee and fizzy drinks, is bad for the body.

It found that sticking to the recommended daily amount of 400mg – the equivalent of four cups of coffee or eight cups of tea – has no lasting damage on the body.

The recommendation that 400mg/day is generally safe was described as ‘caffeine is dangerous’ in 2015 and ‘caffeine is harmless’ now.

The other not-news about this is that the research isn’t new. Obviously the Daily Mail (the only link) isn’t a research source. The research was published in Complete Nutrition, a professional magazine for UK dieticians. As their website says

Each issue of CN is packed with articles which are practical, educational and topical, and all are written by independent, well-respected authors from across the profession.

That’s a valuable mission for a journal, but it would be surprising if an expert opinion article in a journal like that contained new research worth international headlines.

What are election polls trying to estimate? And is Stuff different?

Stuff has a new election ‘poll of polls’.

The Stuff poll of polls is an average of the most recent of each of the public political polls in New Zealand. Currently, there are only three: Roy Morgan, Colmar Brunton and Reid Research. 

When these companies release a new poll it replaces their previous one in the average.

The Stuff poll of polls differs from others by giving weight to each poll based on how recent it is.

All polls less than 36 days old get equal weight. Any poll 36-70 days old carries a weight of 0.67, 70-105 days old a weight 0.33 and polls greater than 105 days old carry no weight in the average.
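A minimal sketch of that weighting scheme in code (the poll figures below are invented for illustration, and the handling of the exact 36/70/105-day boundaries is an assumption, since the description leaves them ambiguous):

```python
# Recency-weighted poll average, per the Stuff scheme described above.
def recency_weight(age_days):
    """Weight a poll by its age in days (boundary handling assumed)."""
    if age_days < 36:
        return 1.0
    elif age_days <= 70:
        return 0.67
    elif age_days <= 105:
        return 0.33
    return 0.0

# (company, poll age in days, party support %) -- most recent poll per company;
# these numbers are made up for the example.
polls = [("Roy Morgan", 10, 43.0),
         ("Colmar Brunton", 50, 45.0),
         ("Reid Research", 90, 44.0)]

weights = [recency_weight(age) for _, age, _ in polls]
estimate = sum(w * s for (_, _, s), w in zip(polls, weights)) / sum(weights)
print(round(estimate, 2))
```

Because each company contributes only its single most recent poll, a stale poll doesn’t just lose weight gradually: it is replaced outright as soon as that company publishes again.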

In thinking about whether this is a good idea, we’d need to first think about what the poll is trying to estimate and about the reasons it doesn’t get that target quantity exactly right.

Officially, polls are trying to estimate what would happen “if an election were held tomorrow”, and there’s no interest in prediction for dates further forward in time than that. If that were strictly true, no-one would care about polls, since the results would refer only to the past two weeks when the surveys were done.

A poll taken over a two-week period is potentially relevant because there’s an underlying truth that, most of the time, changes more slowly than this.  It will occasionally change faster — eg, Donald Trump’s support in the US polls seems to have increased after James Comey’s claims about Clinton’s emails, and Labour’s support in the UK polls increased after the election was called — but it will mostly change slower. In my view, that’s the thing people are trying to estimate, and they’re trying to estimate it because it has some medium-term predictive value.

In addition to changes in the underlying truth, there is the idealised sampling variability that pollsters quote as the ‘margin of error’. There’s also larger sampling variability that comes because polling isn’t mathematically perfect. And there are ‘house effects’, where polls from different companies have consistent differences in the medium to long term, and none of them perfectly match voting intentions as expressed at actual elections.

Most of the time, in New Zealand — when we’re not about to have an election — the only recent poll is a Roy Morgan poll, because Roy Morgan polls much more often than anyone else.  That means the Stuff poll of polls will be dominated by the most recent Roy Morgan poll.  This would be a good idea if you thought that changes in underlying voting intention were large compared to sampling variability and house effects. If you thought sampling variability was larger, you’d want multiple polls from a single company (perhaps downweighted by time).  If you thought house effects were non-negligible, you wouldn’t want to downweight other companies’ older polls as aggressively.

Near an election, there are lots more polls, so the most recent poll from each company is likely to be recent enough to get reasonably high weight. The Stuff poll is then distinctive in that it completely drops all but the most recent poll from each company.

Recency weighting, however, isn’t at all unique to the Stuff poll of polls. For example, the poll of polls downweights older polls, but doesn’t drop the weight to zero once another poll comes out. Peter Ellis’s two summaries both downweight older polls in a more complicated and less arbitrary way; the same was true of Peter Green’s poll aggregation when he was doing it.  Curia’s average downweights even more aggressively than Stuff’s, but does not otherwise discard older polls by the same company. RadioNZ averages only the four most recent available results (regardless of company) — they don’t do any other weighting for recency, but that’s plenty.

However, another thing recent elections have shown us is that uncertainty estimates are important: that’s what Nate Silver and almost no-one else got right in the US. The big limitation of simple, transparent poll of poll aggregators is that they say nothing useful about uncertainty.

July 29, 2017

Anything goes

According to a story in the Herald, based on what looks like it might be a bogus poll (press release), you need $5.3 million in Australia now to be considered rich.  If we assumed the number did actually measure something, how surprising would it be?

Before “Who wants to be a millionaire?” was a quiz show franchise, it was a Cole Porter song, from the  1956 movie “High Society”, so that seems a reasonable comparison period. The Australian CPI has gone up by a factor of 15.6 since 1956 (and while Australia didn’t have dollars until 1966, US and Australian dollars were roughly comparable then).

On top of pure currency conversion, though, Australia is richer now than in 1956.  Australia’s GDP in current purchasing-power adjusted dollars is nearly 8 times what it was in 1956. The population has gone from 9.4 million to 24.1 million, so real GDP per capita is up by a factor of about 3.5.

So, a 1956 million would be 15.6 current millions just from inflation, and over $50 million as a share of Australia’s economy: a millionaire in those days was not just rich, but Big Rich — as the song says: “flashy flunkies everywhere… a gigantic yacht… liveried chauffeur.”
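The arithmetic above can be checked in a few lines (the figures are as quoted in the post; “nearly 8 times” is taken as exactly 8 here, so the results come out slightly below the post’s rounded “about 3.5” and “over $50 million”):

```python
# Back-of-envelope check of the millionaire arithmetic quoted above.
cpi_factor = 15.6              # Australian CPI growth since 1956
gdp_factor = 8.0               # "nearly 8 times" -- PPP-adjusted GDP growth
pop_1956, pop_now = 9.4, 24.1  # population, millions

# Real GDP per capita growth (the post rounds this up to "about 3.5")
per_capita_factor = gdp_factor * pop_1956 / pop_now

# A 1956 million in today's dollars, by inflation alone...
inflation_only = cpi_factor * 1e6
# ...and scaled to keep the same share of per-capita income
share_of_income = cpi_factor * per_capita_factor * 1e6

print(round(per_capita_factor, 1))
print(round(inflation_only / 1e6, 1), round(share_of_income / 1e6, 1))
```

Either way, the qualitative conclusion stands: a 1956 millionaire was far richer, relative to everyone else, than a holder of $5.3 million today.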

We’re not given any real reason to believe the $5.3 million figure — there’s no reason you should rely on it more than your own guess. And ‘millionaire’ isn’t a useful comparison without a lot of additional qualification.

July 27, 2017

Will we ever use this in real life?

From deep in the archives at Language Log

The Pirahã language and culture seem to lack not only the words but also the concepts for numbers, using instead less precise terms like “small size”, “large size” and “collection”. And the Pirahã people themselves seem to be surprisingly uninterested in learning about numbers, and even actively resistant to doing so, despite the fact that in their frequent dealings with traders they have a practical need to evaluate and compare numerical expressions. A similar situation seems to obtain among some other groups in Amazonia, and a lack of indigenous words for numbers has been reported elsewhere in the world.

Many people find this hard to believe. These are simple and natural concepts, of great practical importance: how could rational people resist learning to understand and use them? I don’t know the answer. But I do know that we can investigate a strictly comparable case, equally puzzling to me, right here in the U.S. of A.

From context, you can probably guess where he’s heading

July 25, 2017

Tell them to buy an ad

From the editing blog “Heads Up”

… you don’t need a course in statistics to ask what a writer means by “incident count,” “city” and “occurrence percentage,” not to mention why and how the means are weighted, or even why users of an insurance comparison website would be a good representation of a city where a huge proportion of drivers are uninsured. 


This isn’t “fake news” in the 2016 sense; it’s the old-school kind that has always gotten past enough gatekeepers to do its work. The traditional response is “tell them to buy an ad.”


  • “Algorithms can dictate whether you get a mortgage or how much you pay for insurance. But sometimes they’re wrong – and sometimes they are designed to deceive” Cathy O’Neil, for Observer.
  • A talk about human factors research and what it says about data visualisation
  • “Point your phone at any mushroom and take a pic, our tech will instantly identify any mushrooms while giving you an article you can read or listen to.” This app seems to be intended as educational ‘augmented reality’, but one reason people want to identify mushrooms is to decide whether it’s safe to eat them. That’s not possible from just a photo, and the costs of some of the possible classification errors are very, very high.
  • A new trend in graphics: ‘joyplots’, named for the famous cover art of a Joy Division album. Here’s a history of the album cover, from Jen Christiansen. And now some examples:


July 21, 2017

This time we might be number one

So, Radio NZ has a story based on a commentary at YaleGlobal on homelessness.

The point of the YaleGlobal piece is that homelessness is increasing as the world gets more urbanised, and that it’s really hard to measure because people define it differently and because some countries don’t want it measured accurately. Overall

Based on national reports, it’s estimated that no less than 150 million people, or about 2 percent of the world’s population, are homeless. However, about 1.6 billion, more than 20 percent of the world’s population, may lack adequate housing.

There’s obviously a lot of room for variation in definitions.

This report isn’t Yale research, really. It’s based on OECD figures, which are reported by governments: the OECD HC3-1 indicator (PDF).  The number for New Zealand is 41705, which we saw last year in the NZ media. It comes from the 2013 census, and was estimated by researchers at Otago.  The NZ homelessness number is high for at least three reasons.  First, NZ uses a very broad definition of homelessness. Second, we’re pretty good at honest data collection. And, third, we’ve got a serious homelessness problem (and have had for a while).

The Government is right to say that the international figures aren’t all comparable. Some countries only count people who are sleeping rough. Others include people in shelters or emergency accommodation. We include a lot more. The Herald story from last year quotes an Otago researcher, Kate Amore

“If the homeless population were a hundred people, 70 are staying with extended family or friends in severely crowded houses, 20 are in a motel, boarding house or camping ground, and 10 are living on the street, in cars, or in other improvised dwellings.”

From that tally, a few countries don’t even count all of the 10; some don’t count all of the 20; many don’t count the 70 —  and some aren’t very good at counting.

Here’s a set of charts I made based on a crude classification of definitions from the OECD HC3-1 report. The numbers on the axis are in % of the population.


Even within the top panel, NZ, the Czech Republic, and Australia have the broadest definitions. The HC3-1 report says

Australia, the Czech Republic and New Zealand report a relatively large incidence of homelessness, and this is partly explained by the fact that these countries adopt a broad definition of homelessness. In Australia people are considered as homeless if they have “no other options to acquire safe and secure housing are without shelter, in temporary accommodation, sharing accommodation with a household or living in uninhabitable housing”. In the Czech Republic the term homeless covers “persons sleeping rough (roofless), people who are not able to procure any dwelling and hence live in accommodation for the homeless, and people living in insecure accommodation and people staying in conditions which do not fulfil the minimum standards of living […]”. In New Zealand homelessness is defined as “living situations where people with no other options to acquire safe and secure housing: are without shelter, in temporary accommodation, sharing accommodation with a household or living in uninhabitable housing.”

I think the New Zealand definition is a good one for measuring the housing deprivation problem, but it’s not good for international comparisons.

On the other hand, the comparison to Australia is pretty fair, and there’s at least no evidence that anywhere else has higher rates.  To some extent we have an apples vs oranges comparison, but that doesn’t stop us concluding it’s a bad apple.

July 20, 2017

Avocado denominators

Time magazine’s website has had at least seven stories about avocado on toast since May. The most recent one says

Square, a tech company that helps businesses process credit card payments, crunched data from sellers around the U.S. and found that Americans are spending nearly $900,000 per month on crusty bread topped with mashed green fruit.

What’s impressive about that number is how unimpressive it is.  The US is a big place. Even if we only count millennials, there are 80 million of them, so we’re talking about an average of 1 cent each per month.
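That per-person figure is simple enough to check directly (80 million is the rough millennial headcount used above):

```python
# Quick check of the "about 1 cent each per month" claim.
monthly_spend = 900_000        # dollars per month via Square, as quoted
millennials = 80_000_000       # rough US millennial population
cents_each = monthly_spend / millennials * 100
print(cents_each)              # just over one cent per person per month
```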

But it gets worse from there. The story talks about the 50-fold increase since 2014. That’s an increase in sales handled by Square, an expanding business. Neither Time nor Square seems to have made any attempt to look at sales from comparable businesses over time, which Square could have done easily.

The whole avocado-toast thing only makes sense as a synecdoche for an eating-out lifestyle, so accurate data about avocado on toast isn’t really going to be very helpful for anything important. Even so, it’s possible for innumerate presentations of inaccurate data to be less helpful.

July 3, 2017

Where science clickbait comes from

From the Chronicle of Higher Education

Mr. Vijg said repeatedly that his Nature paper made no “definitive statement” about a maximum human age and that he felt “amazement” that anyone might think otherwise. But he acknowledged approving a news release about his study issued by Albert Einstein College with the headline: “Maximum human lifespan has already been reached, Einstein researchers conclude.”

It’s more often the press release than the research paper, but with the consent of the researchers. If press releases were signed by a researcher and made available along with the research paper, there’d be more incentive not to do this sort of thing.