Posts written by Thomas Lumley (1636)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

November 16, 2015

Measuring gender

So, since we’re having a Transgender Week of Awareness at the moment, it seems like a good time to look at how statisticians ask people about gender, and why it’s harder than it looks.

By ‘harder than it looks’ I don’t just mean that it isn’t a binary question; we’re past that stage, I hope.  Also, this isn’t about biological sex — in genetics I do sometimes care how many X chromosomes someone has, but most questionnaires don’t need to know. It’s harder than it looks because there isn’t just one question.

The basic Male/Female binary question can be extended in (at least) two directions.  The first is to add categories to represent other ways people identify their gender beyond just male/female, which can be fluid over time, or can have more than two categories. Here a write-in option is useful since you almost certainly don’t know all the distinctions people care about across different cultures. In a specialised questionnaire you might even want to separate out questions about fluid/constant identity from non-binary/diversity, but for routine use that might be more than you need.

A second direction is to ask about transgender status, which is relevant for discrimination and (or thus) for some physical and mental health risks. (Here you might also want to find out about people who, say, identify as female but present as male.) We have very little idea how many people are transgender (it makes data on sexual orientation look really precise), and that’s a problem for service provision and in many other areas.

Life would get simpler for survey collectors if you combined these into a single question, or if you had a Male/Female/It’s Complicated question with follow-up questions for the third group. On the other hand, it’s pretty clear why trans people don’t like that approach. These really are different questions. For people whose answer to the first question is something like “it depends” or a culturally specific third option, the combination may not be too bad. The problem comes when the answer to the second type of question might be “Trans (and yes I sometimes get comments behind my back at work but most people are fine)”, but the answer to the first is “Female (and just as female as people with ovaries and a birth certificate, ok)”.

Earlier this year Stats New Zealand ran a discussion and  had a go at a better gender question, and it is definitely better than the old one, especially when it allows for multiple answers and for a write-in answer. They also have a ‘synonym list’ to help people work with free-text answers, although that’s going to be limited if all it does is map back to binary or three-way groups. What they didn’t do was to ask for different types of information separately. [edit: ie, they won’t let you unambiguously say ‘female’ in an identity question then ‘trans’ in a different question]

It’s true that for a lot of purposes you don’t need all this information. But then, for a lot of purposes you don’t actually need to know anything about gender.

(via Writehanded and Jennifer Katherine Shields)

November 15, 2015

Out of how many?

Stuff has a story under the headline ACC statistics show New Zealand’s riskiest industries. They don’t. They show the industries with the largest numbers of claims.

To see why that’s a problem, consider instead the number of claims by broad ethnicity grouping: 135,000 for European, 23,100 for Māori, 10,800 for Pacific peoples (via StatsNZ). There’s no way that European ethnicity gives you a hugely greater risk of occupational injury than Māori or Pacific workers have. The difference between these groups is basically just population size. The true risks go in the opposite direction: 89 claims per 1000 full-time equivalent workers of European ethnicities, 97 for Māori, and 106 for Pacific.
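As a rough illustration of the counts-versus-rates point, here is a minimal Python sketch using only the figures quoted above. The "implied FTE" values are backed out from those figures for illustration (claims divided by rate); they are an assumption of this sketch, not official workforce numbers.

```python
# Claim counts and claim rates as quoted in the post (via StatsNZ).
claims = {"European": 135_000, "Māori": 23_100, "Pacific peoples": 10_800}
rate_per_1000_fte = {"European": 89, "Māori": 97, "Pacific peoples": 106}


def rank(values):
    """Return group names ordered from largest to smallest value."""
    return sorted(values, key=values.get, reverse=True)


# Workforce size implied by count and rate together.
# Derived for illustration only; not an official figure.
implied_fte = {g: claims[g] / rate_per_1000_fte[g] * 1000 for g in claims}

by_count = rank(claims)
by_rate = rank(rate_per_1000_fte)

print("Ranked by number of claims:   ", by_count)
print("Ranked by claims per 1000 FTE:", by_rate)
for group in claims:
    print(f"{group}: roughly {implied_fte[group]:,.0f} implied FTE workers")
```

The two rankings come out in opposite orders, which is the whole problem with treating a count as a risk: the count mostly tracks the denominator.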

With just the total claims we can’t tell whether working in supermarkets and grocery stores is really much more dangerous than logging, as the story suggests. I’m dubious, but.

November 13, 2015

Blood pressure experiments

The two major US medical journals each published  a report this week about an experiment on healthy humans involving blood pressure.

One of these was a serious multi-year, multi-million-dollar clinical trial in over 9000 people, trying to refine the treatment of high blood pressure. The other looks like a borderline-ethical publicity stunt.  Guess which one ended up in Stuff.

In the experiment, 25 people were given an energy drink

We hypothesized that drinking a commercially available energy drink compared with a placebo drink increases blood pressure and heart rate in healthy adults at rest and in response to mental and physical stress (primary outcomes). Furthermore, we hypothesized that these hemodynamic changes are associated with sympathetic activation, which could predispose to increased cardiovascular risk (secondary outcomes).

The result was that consuming caffeine made blood pressure and heart rate go up for a short period,  and that levels of the hormone norepinephrine  in the blood also went up. Oh, and that consuming caffeine led to more caffeine in the bloodstream than consuming no caffeine.

The findings about blood pressure, heart rate, and norepinephrine are about as surprising as the finding about caffeine in the blood. If you do a Google search on “caffeine blood pressure”, the recommendation box at the top of the results is advice from the Mayo Clinic. It begins

Caffeine can cause a short, but dramatic increase in your blood pressure, even if you don’t have high blood pressure.

The Mayo Clinic, incidentally, is where the new experiment was done.

I looked at the PubMed research database for research on caffeine and blood pressure.  The oldest paper in English for which I could get full text was from 1981. It begins

Acute caffeine in subjects who do not normally ingest methylxanthines leads to increases in blood pressure, heart rate, plasma epinephrine, plasma norepinephrine, plasma renin activity, and urinary catecholamines.

This already wasn’t news in 1981.

Now, I don’t actually like energy drinks; I prefer my caffeine hot and bitter.  Since many energy drinks have as much caffeine as good coffee and some have almost as much sugar as apple juice, there’s probably some unsafe level of consumption, especially for kids.

What I don’t like is dressing this up as new science. The acute effects of caffeine on the cardiovascular system have been known for a long time. It seems strange to do a new human experiment just to demonstrate them again. In particular, it seems ethically dubious if you think these effects are dangerous enough to put out a press release about.


Flag text analysis

The group in charge of the flag candidate selection put out a summary of public responses in the form of a word cloud. Today in Insights at the Herald there’s a more accurate word cloud, using phrases as well as single words and not throwing out all the negative responses.


There’s also some more sophisticated text analysis of the responses, showing what phrases and groups of ideas were common, and an accompanying story by Matt Nippert:

Suzanne Stephenson, head of communications for the flag panel, rejected any suggestion of spin and said the wordcloud was never claimed as “statistically significant”.

“I think people misunderstood it as a polling exercise.”

“Statistically significant” is an irrelevant misuse of technical jargon. The only use for a word cloud is to show which words are more common. If that wasn’t what the panel wanted to do, they shouldn’t have done it.



Drug subsidy arithmetic

There are new, very promising treatments for some cancers, which work by disabling one of the safety mechanisms in the immune system so that it can attack the tumour. These treatments have been approved in the US for melanoma and the most common type of lung cancer, and they look better than anything we’ve seen before.  The problem is the price, more than NZ$200,000.

In New Zealand, roughly 2000 people die of melanoma or lung cancer each year. At the current market prices, a course of treatment for each of them would absorb more than half of Pharmac’s budget.
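The "more than half" claim can be checked in a couple of lines. This is a back-of-envelope sketch: the price and the number of deaths are the round figures from this post, and the budget is the "a little under $800 million" figure quoted in another post on this page. All three are approximations, so the result is a bound rather than an estimate.

```python
# Round figures from the posts; all are approximate.
price_per_course = 200_000    # NZ$, the post says "more than" this
deaths_per_year = 2_000       # melanoma or lung cancer, "roughly"
pharmac_budget = 800_000_000  # NZ$ per year, "a little under" this

total_cost = price_per_course * deaths_per_year
share_of_budget = total_cost / pharmac_budget

print(f"Total: NZ${total_cost / 1e6:.0f} million, "
      f"{share_of_budget:.0%} of the budget")
```

Because the price is understated and the budget overstated, the true share would be somewhat above the 50% this prints.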

It’s not even as if the $200,000 guarantees a cure. The results of the KEYNOTE trial in melanoma were impressively good by the standard of melanoma treatment, but:

The overall response rate was 34% and the complete response rate was 6%. Eighty percent of responses were still ongoing at the time of analysis, and the median duration of response has not yet been reached.

There really are people whose disease completely vanished, but only about one in sixteen. Two thirds didn’t see any response. Even for the people who have no detectable disease we can’t yet know if the benefits will last a few years or a lifetime.

Pharmac is not going to fund these treatments at anything like the current price in the current budget. That’s not a matter of debate or public pressure. It’s just not happening. It won’t add up. Conceivably, the government could decide to come up with the money to fund the treatments outside the current Pharmac budget. I don’t think that would be the best way to spend the money, but I’m glad to say this is the sort of decision I don’t get to make.

Over the next few years, other companies will introduce treatments that attack the same or related immune checkpoint targets, and competition will make the price fall. At some point, it will be worthwhile for drug companies to make Pharmac an offer it can accept, as happened recently with Humira, Pharmac’s current top spend (which turns off the immune response in a somewhat similar way to how the new cancer treatments turn it on).

Five years ago, you couldn’t get these treatments if you were a billionaire. In ten years or so, I expect these or similar treatments will be effectively free to New Zealanders. At the moment, we’re in the painful transition period, where the manufacturers can afford to target only the wealthiest individuals and insurance companies.

November 10, 2015

Unwise baby name claims

You’ve probably seen this map from Reddit: more people live inside the circle than outside it.


Another map, at Stuff, claims to show the countries where “Sofia” and its variants rank high as a name for new babies.


Based on these rankings, we get

“The numbers have been crunched and the results are in. Forget John, Mohammed, Charlotte or Olivia: the most popular baby name in the world right now is Sofia.”

You can see from the map that more than half the world’s population is in countries labelled “No Data”. In fact, more than half the world’s population, plus Brazil and all of Africa. “Sofia” is the most popular baby name the way L&P is world famous in New Zealand.

But that’s not the worst bit. The first line of the story said “Forget John, Mohammed, Charlotte or Olivia”. The statistics on the map and on the linked website are for Sofia as a popular name for girls. Boys’ names aren’t in the comparison — Stuff did just ‘forget’ John and Mohammed.

You’ve got to respect Laura Wattenberg of BabyNameWizard, who does a great job getting her website into the news. Sites that take this sort of story and exaggerate it into obviously unfounded headline news, maybe not so much respect.


New blood pressure trial

A big randomised trial comparing strategies for treating high blood pressure has just ended early (paper, paywalled).  There’s good coverage in the New York Times, and there will probably be a lot more over the next week. It’s a relatively complicated story.

The main points:

  • Traditionally, doctors try to get your blood pressure below 140mmHg, but some people always thought lower would be better.
  • The study, funded by the US government, randomly allocated over 9000 people with high blood pressure and some other heart disease risk factor (but not diabetes) to either try to get blood pressure of 140mmHg or try to get 120mmHg.
  • A previous trial with the same targets, but in people with diabetes, had been unimpressive: the results slightly favoured more-intensive treatment, but the difference was small, and well within the variation you’d expect by chance.
  • In the new trial blood pressure targeting worked really well: the average blood pressure in the low group was 122mmHg, and in the normal group was 135.
  • Typically, people in the low group took two or three blood pressure medications, those in the normal group typically took one or two — but in both cases with quite a lot of variation.
  • There were 76 fewer ‘primary outcome events’ (heart attack, stroke, heart failure, or death from heart disease) in the low BP group, and 55 fewer deaths from any cause.
  • From the beginning, the plan was to stop whenever the difference in number of ‘primary outcome events’ exceeded a specified threshold, unless there was a good reason based on the data to continue. The difference had been just barely over the threshold at the previous analysis, and they continued. In mid-September it was clearly over the threshold, and they stopped.
  • Stopping early will tend to overestimate the benefit, but the fact that they waited for one more analysis reduces this bias.

I’m surprised the benefit from extreme blood pressure reduction is so large (in a relative sense), but even more surprised that they managed to get so many healthy people to take their treatments that consistently for over three years.  As context for this, data from a US national survey in 2011-12 showed only about two-thirds of those currently taking medications for high blood pressure even get down to 140mmHg.

In an absolute sense the risk reduction is relatively small: for every thousand people on intensive blood pressure reduction — healthy people taking multiple pills, multiple times per day — they saw 12 fewer deaths and 16 fewer ‘events’.   On the other hand, the treatments are cheap and most people can find a combination without much in the way of side effects. If intensive treatment becomes standard, there will probably be more use of combination pills to make multiple drugs easier to take.
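One common way to express an absolute risk reduction is the number needed to treat (NNT): how many people have to take the intensive treatment, over the trial's follow-up, for one fewer event. The sketch below uses only the per-thousand figures quoted above; the function name is made up for illustration, and the results are rough because the inputs are rounded.

```python
# Per-thousand differences quoted in the post (rounded).
fewer_deaths_per_1000 = 12
fewer_events_per_1000 = 16


def number_needed_to_treat(fewer_per_1000):
    """NNT is the reciprocal of the absolute risk reduction."""
    absolute_risk_reduction = fewer_per_1000 / 1000
    return 1 / absolute_risk_reduction


nnt_deaths = number_needed_to_treat(fewer_deaths_per_1000)
nnt_events = number_needed_to_treat(fewer_events_per_1000)

print(f"About {nnt_deaths:.0f} people treated intensively per death avoided")
print(f"About {nnt_events:.0f} people treated intensively per event avoided")
```

That is, something like 60 to 85 people on intensive treatment for each event or death avoided during the trial, which is why cost and side effects matter so much here.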

There’s one moderately worrying factor: a higher rate of kidney impairment in the low BP group (higher by a couple of percentage points). The researchers indicate that they don’t know if this is real, permanent  damage, and that more follow-up and testing of those people is needed. If it is a real problem it could be more serious in ordinary medical practice than in the obsessively-monitored trial.  This may well explain why the trial didn’t stop even earlier:  the monitoring committee would have wanted to be sure the benefits were real given the possibility of adverse effects — the sort of difficult decision that is why you have experienced, independent monitoring committees. 


  • One of the problems with reducing things to hypothesis testing is that you often don’t want to make a one-off decision. A comic from Saturday Morning Breakfast Cereal makes this point, but in a case where you really do want to make a one-off decision.
  • From Stuff: “‘Most women are either lesbian or bisexual’ but never straight, study claims”. From the Daily Beast: “Um, yes, straight women are real”.
  • Digit preference in US football “The only explanation that has stuck with me is that when a official thinks ‘oh damn this is a mess, there are seven separate six foot tall millionaires all piled up on top of the ball and I have 100 rules to try and remember, where did that ball stop?’  their subconscious makes them grab for the safety blanket of a line drawn and place it down on there.”
  • Digit preference in marathons: People really prefer to run 3:59 rather than 4:00 (early last year, from some economists at Chicago (PDF), but you probably saw the New York Times version).
  • A useful bit of arithmetic for “Why isn’t this medication free?” stories. Pharmac’s budget is a little under $800 million per year.  There are about 4.5 million people in New Zealand, so that’s under $200 per person per year, or under $16,000 per person per expected lifetime. Based on more accurate inputs, it’s about $14,000 per person per lifetime.
  • In the US, mortality isn’t falling for 45-54 year olds identifying as white the way it is for basically everyone else in the West. If you read lots of statistics blogs, you will have seen discussion about whether mortality in this group is really rising or not: the peak of the baby boom just swept through the 45-54 band, so the average age of people in this group has increased. That’s worth looking at, but doesn’t change the basic message.
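The Pharmac budget arithmetic in the list above is easy to reproduce. In this sketch the budget and population are the rounded figures from that item; the 80-year lifetime is an assumption chosen because it is what the "under $200" to "under $16,000" step implies.

```python
# Rounded inputs from the list item; the lifetime is an assumption.
budget = 800_000_000    # NZ$ per year, "a little under" this
population = 4_500_000  # "about 4.5 million"
lifetime_years = 80     # assumed expected lifetime

per_person_per_year = budget / population
per_person_per_lifetime = per_person_per_year * lifetime_years

print(f"Per person per year:     ${per_person_per_year:,.0f}")
print(f"Per person per lifetime: ${per_person_per_lifetime:,.0f}")
```

With these inputs the yearly figure comes in under $200 and the lifetime figure lands near $14,000, consistent with both numbers given in the item.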

November 9, 2015

Inelegant variation

These graphs are from the (US) National Cable & Telecommunications Association (the cable guys).


Apart from the first graph, they are based on five-point agree-disagree scales, and show the many ways you can make pie and bar charts more interesting, especially if you don’t care much about the data. I think my favourites are the bendy green barchart-orbiting-a-black-hole and the green rectangles, where the bars disagree with the printed numbers.

Since it’s a bogus poll, using the results basically to generate artwork is probably the right approach.

Fish and chips might be bad for you

From the Herald (from the Telegraph)

Martin Grootveld, a professor of bioanalytical chemistry and chemical pathology, said his research showed “a typical meal of fish and chips”, fried in vegetable oil, contained as much as 100 to 200 times more toxic aldehydes than the safe daily limit set by the World Health Organisation.

In contrast, heating up butter, olive oil and lard in tests produced much lower levels of aldehydes. Coconut oil produced the lowest levels of the harmful chemicals.


That’s in the lab. In July, Professor Grootveld reported the same type of analysis for a BBC program, but on oil as actually used by home cooks. From the press release at De Montfort University

Professor Grootveld’s team found sunflower oil and corn oil produced aldehydes at levels 20 times higher than recommended by the World Health Organisation. 

Olive oil and rapeseed oil produced far fewer aldehydes as did butter and goose fat.

So, about an order of magnitude less bad than the current story.

The story talks about turning current food advice on its head. The most… the two most… among the several most important things wrong with that claim are: first, that oils high in monounsaturated fats (such as olive oil and rapeseed/canola) are the current food advice; second, that the advice to eat less saturated fat is based on studies of actual disease, not just on lab biochemistry;  third, Prof Grootveld published research on this lipid oxidation phenomenon in 1998, so his reported surprise at the findings is a bit strange; and fourth, “a typical meal of fish and chips” hasn’t been regarded as health food since basically forever.