Posts written by Thomas Lumley (2644)

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

January 26, 2026

Briefly

  • From the BBC “Vitamin D deficiency linked to hospital admissions”. This is from a large British study correlating vitamin D levels in the blood with hospital admissions for respiratory infections. You might say “Someone ought to do a clinical trial to see if giving people vitamin D reduces infections or if  it’s just correlation”. Someone has, here in NZ. Also, if you combine all the trials on this question you get an estimate of somewhere between a 10% reduction and a 4% increase. It’s still possible that it works in people with especially low vitamin D or something, but across a range of diseases randomised clinical trials of vitamin D have been robustly disappointing in comparison to correlational studies.
  • From Derek Lowe, a post on extremely bad clinical trial conduct. This isn’t fraud by Big Pharma: if it counts as fraud, Big Pharma is a victim (along with some, but possibly not all, of the trial participants).
  • “A very detailed map of Trump’s job approval” from Strength In Numbers (click to embiggen). “The basic idea is that we fit a model predicting approval based on demographics and geography, then use Census data to weight those predictions by the actual population composition of each area. Election results are used to calibrate estimates to sensible baselines, so we have a real-world check against our survey data. It lets us produce reliable estimates even for places where we only have a handful of direct survey respondents.”
  • From the Guardian, Australian supermarket online prices per each can be very different from the in-store per-kg prices. Most dramatically, green capsicums were allegedly 50% more expensive per each.
  • Bogus polls show increases in church attendance by young adults: Pew Research
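
The weighting step in the job-approval map item above can be sketched in a few lines. This is only an illustration of poststratification under made-up numbers: the demographic cells, the predicted approval rates, the area names, and the Census counts are all hypothetical, and in a real analysis the predictions would come from a fitted model rather than being typed in.

```python
# Hypothetical model output: predicted approval rate for each demographic cell.
predicted_approval = {
    ("18-44", "no degree"): 0.40,
    ("18-44", "degree"): 0.30,
    ("45+", "no degree"): 0.55,
    ("45+", "degree"): 0.40,
}

# Hypothetical Census counts of people in each cell, for two made-up areas.
census_counts = {
    "Area A": {
        ("18-44", "no degree"): 1000,
        ("18-44", "degree"): 3000,
        ("45+", "no degree"): 1000,
        ("45+", "degree"): 5000,
    },
    "Area B": {
        ("18-44", "no degree"): 4000,
        ("18-44", "degree"): 1000,
        ("45+", "no degree"): 4000,
        ("45+", "degree"): 1000,
    },
}


def poststratify(predictions, counts):
    """Average the cell predictions, weighting each cell by its population."""
    total = sum(counts.values())
    return sum(predictions[cell] * n for cell, n in counts.items()) / total


area_estimates = {
    area: poststratify(predicted_approval, counts)
    for area, counts in census_counts.items()
}

for area, estimate in area_estimates.items():
    print(f"{area}: {estimate:.3f}")
```

The same cell-level predictions give different area estimates purely because the areas have different population mixes, which is why an area with only a handful of survey respondents can still get a usable estimate.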

It’s cold out there

Right now, it is very cold in the central and eastern United States.  Minneapolis has been in the news (for this as well as the bad sort of ice), but it’s not just there. The Mayor of New York has warned locals about a major snowstorm (and suggested it would be a good time to stay home and borrow the e-book or audiobook of  Heated Rivalry from the city libraries). There’s freezing weather in parts of Texas that are really not built to handle it.

Various people, as usual, have said the cold weather refutes global warming.

As you all know, the issue with global warming  is that it’s global (there’s a hint in the name) rather than local.  I always recommend looking at global temperature patterns.  Here’s the global temperature anomaly, from Climate Reanalyzer, at the University of Maine

The map is based on today’s weather forecast around the world, averaged over 24 hours and compared to the average for the same day of the year from 1979 to 2000.  As you can see, it’s unusually cold in the USA. It’s also unusually cold in a band across Asia. On the other hand, it’s unusually warm in Greenland and the far north of North America and in Siberia.
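
For a single grid point, the calculation behind a map like this is roughly the following. It is a minimal sketch with invented temperatures, not Climate Reanalyzer's actual code; the function name and the numbers are assumptions, and only three baseline years are shown where the real baseline would use every year from 1979 to 2000.

```python
from statistics import mean


def daily_anomaly(forecast_hourly_c, baseline_daily_means_c):
    """Today's 24-hour mean forecast minus the baseline mean for this calendar day."""
    return mean(forecast_hourly_c) - mean(baseline_daily_means_c)


# Invented hourly forecast temperatures (degrees C) for one location today.
forecast = [-20.0] * 12 + [-16.0] * 12

# Invented daily means for the same calendar day in three baseline years.
baseline = [-9.0, -11.0, -10.0]

anomaly = daily_anomaly(forecast, baseline)
print(f"anomaly: {anomaly:+.1f} C")
```

A negative anomaly means colder than that day's baseline average, and a positive one means warmer.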

This image shows a view around the north pole

You can see here the problem is that the cold that belongs up in the Arctic has slipped down over Asia and North America.  There isn’t extra cold in the world, it’s just in unusual places.  This sort of unusual movement of north polar air is perfectly consistent with global warming models.

January 9, 2026

Baby names

The top baby names from 2025 are out, together with historic data to look at trends. Sadly, the historic data for boys’ names and for girls’ names come as three-page PDF tables with very small print, not as some conveniently computer-readable or human-readable format.  We can still see some interesting trends.

The top names last year were Noah for boys (244 times) and Isla for girls (179 times). There has always been more variability in girls’ names: there are always more boys with the most popular name than girls with the most popular name.

The total number of births in NZ has been broadly stable since the 1950s but the number of babies with the most popular name has steadily been decreasing, implying increasing name diversity. In 1954 there were 1389 Johns; in 1979 there were 707 Michaels; in 2004 there were 504 Joshuas. For girls, the numbers were 779 Christines in 1954, 578 Sarahs in 1979, and 352 Emmas in 2004.

There aren’t any names that appear in the top 100 for both boys and girls. There are a few names that, over history, have been in both lists, but I haven’t found any that were in both lists in the same year; the closest was Kim, a top-100 name for boys in 1961 and 1962 and for girls in quite a few years starting from 1968.

January 8, 2026

Pie chart issues

This was on a real-estate agent’s advertising leaflet at a local café

If you aren’t from around here, those are neighbourhoods in south central Auckland.

Statisticians often complain about pie charts because it’s hard to make numerical comparisons between the categories, especially compared to a bar chart

The poor visual comparison might actually be a virtue in this case if the point is just that these neighbourhoods are similar.  In any case, there’s a deeper problem: pie charts are fundamentally about the relationship between portions and a total — slices and the whole pie.  In this example there is no meaningful total that the separate medians are components of.  There isn’t a pie for these to be slices of.

January 6, 2026

Vibe graphs

From Nicola Rennie on Bluesky, a bad graph found on LinkedIn:

and a correct version of the same graph

The bad version is probably from generative AI — as Nicola says, it is bad in ways that would take substantial effort to achieve in commonly-used software, ranging from the weird bar alignment to the incorrect lengths to the incoherent choice of colours to the Slovenioid flag to the spelling of Belgıun.  It’s also a bit vague about the data source, but that’s easy to achieve by hand.

The corrected version is a lot better, but brings out that this is actually hard to interpret. What’s a “foreign” language?  If you’re Welsh or Irish, can English count? Can Spanish count for the Basques? Less politically, if you’re Czech and you speak Slovak, is that a foreign language? Is it still a foreign language if you learned it before 1992?  If you grew up in Ghent, speaking Flemish and French, and learned English at school then you know one foreign language, but if you move to London do you suddenly know two foreign languages?

You might say “language that is not an official language of where you live”, which is less ambiguous but does require identifying all the official languages of where you live. These are typically well-known within any given country or region (though there are people who profess to be confused about whether English is an official language of New Zealand), but they can be hard to determine by database search.

Kieran Healy, of Duke, gave an excellent talk last year about “Trustworthy Data Visualisation”: having graphs you can trust is not just about reproducibility in a simple sense, but about the systems that allow you to trust what you see: The important thing is not to lose sight of the collective, cooperative character of the whole enterprise.

Cross-national comparisons require that someone in each country has collected data, that the data answer the question you are interested in, that the biases and edge cases are either unimportant or the same across the countries, that the data have been accumulated, and that someone has drawn a graph.  In the past, all these steps were done by accountable people or organisations who were (or perhaps weren’t) honestly trying to provide good information. All these steps became more accessible over the past few decades, but we may be about to lose it all again.

You might well have good and sufficient reasons to trust your vibe graphics for your purposes.  It’s hard to see how other people can have good and sufficient reasons to trust them, though.

November 16, 2025

50-year mortgages and avocado toast

Sometimes society or government faces a problem where not enough money is being spent on something.  Kiwis on average aren’t allocating enough money to retirement, or councils aren’t investing enough money on water infrastructure. In that scenario, you want to spend more money.  Kiwisaver was supposed to get people to save more. Making people save more was the initial effort of the “nudge” industry. No-one seems to really know how to make councils plan for water infrastructure,  but it would be nice if we could.

The housing industry is not like that. The problem with housing prices is not that we are spending too little on houses. We (collectively) are spending too much on houses. That’s why avocado toast is not a housing issue* — if abolishing avocado toast would increase total expenditure on housing, we’d be worse off, not better off.

This week, in the US, 50-year fixed mortgages have been proposed.  A 50-year mortgage would increase the amount you could spend on a house for a given level of savings and income (at the cost of dramatically reducing its value as an investment). If the problem with housing were insufficient money being spent this might help, but that’s not the problem.  A change in financing that lets people bid more for houses doesn’t help.  Like abolishing avocado toast, extending mortgage terms is trying to solve the wrong problem.  Unlike abolishing avocado toast, it might have a real effect on the market.

* and because the basic arithmetic doesn’t make sense, but that’s a different post

October 20, 2025

Briefly

  • It’s World Statistics Day (which only happens every five years). Well, because time zones it’s not actually World Statistics Day for another half-hour as I write this.
  • The US Secretary of Health has claimed teenage boys have sperm counts half of those in 65-year-old men. Angela Rasmussen looks at this claim. If you think about it, as she points out, there’s no plausible way there could be good worldwide evidence on sperm counts in teenagers: how would you get those data?!
  • Backblaze, who sell cloud storage, have periodically reported on the long-term survival of hard drives (and released data, too) — these are the old-fashioned “spinning rust” hard drives, not solid-state drives.  Their new report says that drives are getting better, and that they don’t see the “bathtub” risk curve of folklore, where the newest and oldest drives are most likely to fail.
  • Consumer Reports has published on heavy metal content in protein powders. They say “more than two-thirds of them contain more lead in a single serving than our experts say is safe to have in a day”.  One issue here is that lead can be measured sensitively with modern technology, and is notoriously said to have no safe level, so in a sense all food will have more lead than is safe.  Consumer Reports does acknowledge this; their threshold, which one could perhaps describe as ‘not really unsafe’, is 0.5 micrograms per day.  I think it’s useful to have some historical context. In the 1980s, the “provisional tolerated weekly intake” was 25 ug/kg/week, or about 250 ug/day for a 70kg adult. For infants, even breast milk added up to 0.5ug/kg/day, well above the modern limit, and formula was much higher. So, yes, we know more about lead now and we’re right to be more scared of it, but there are a lot of people in the world who have been exposed to way more lead than these protein supplements would give you.
  • This map from USA Today is misleadingly labelled, as often happens.  It’s what I call a “caricature map”. It doesn’t show each state’s most ordered Halloween candy. It doesn’t even show each state’s most ordered Halloween candy from this specific online retailer. It shows, for each state, which candy is most over-ordered relative to the rest of the country.  Like a caricature, where you find the distinctive features of a person’s face and exaggerate them, the map finds what’s different about candy purchases in each state and promotes that to the state norm.  These maps aren’t bad — the most common candy/side dish/toy/whatever in each state is often a fairly boring map — but they would be better with an accurate description (this was from USAToday on Bluesky — the story on their website does a bit better).
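
The difference between a “most ordered” map and a caricature map in the Halloween candy item above is easy to see with toy data. Everything here is invented for illustration: the state and candy names, the order counts, and the choice to compare each state's share with the national share (including that state) rather than with the rest of the country.

```python
# Invented order counts by state and candy.
orders = {
    "State X": {"Candy A": 500, "Candy B": 300, "Candy C": 200},
    "State Y": {"Candy A": 600, "Candy B": 100, "Candy C": 300},
    "State Z": {"Candy A": 700, "Candy B": 200, "Candy C": 100},
}

# National share of each candy across all states combined.
national_totals = {}
for counts in orders.values():
    for candy, n in counts.items():
        national_totals[candy] = national_totals.get(candy, 0) + n
grand_total = sum(national_totals.values())
national_share = {c: n / grand_total for c, n in national_totals.items()}


def most_ordered(counts):
    """The candy with the largest raw count in this state."""
    return max(counts, key=counts.get)


def most_over_ordered(counts):
    """The candy whose share in this state most exceeds its national share."""
    state_total = sum(counts.values())
    return max(counts, key=lambda c: (counts[c] / state_total) / national_share[c])


plain_map = {state: most_ordered(counts) for state, counts in orders.items()}
caricature_map = {state: most_over_ordered(counts) for state, counts in orders.items()}

print("most ordered:     ", plain_map)
print("most over-ordered:", caricature_map)
```

In this toy example the plain map is the same candy everywhere, which is the boring map, while the caricature map picks a different candy for each state even though none of them is the local favourite in two of the three states.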
September 28, 2025

Briefly

  • From the Guardian: Exclusive: Study gives 85.7% probability Badminton House version of The Lute Player is by 17th-century master. As I said about a previous rating from the same company, there’s no way this probability is meaningful to three significant digits (except potentially to the computer). The company’s head, Dr Carina Popovici, told the Guardian: “Everything over 80% is very high.” which is, um, reasonable.  Importantly, we’re not told any of the “compared to what” information. Is this 85.7% considering that it was previously described as fake and doesn’t have good provenance, or is it 85.7% if the painting was selected from a training set of half real and half fakes?  Or what?
  • From The Xylom via Flowing Data, a map of H1B visa holders at US universities, including what fraction of the research budget it would take to keep hiring at the same rate under the new rules.  I’m not sure the research budget is the right comparison: yes, a lot of H1B holders are postdoctoral researchers, but I was in a regular academic job when I had an H1B.
  • Voting has just closed in Bird of the Year, the only online bogus clicky poll endorsed by StatsChat.  Bird of the Year takes a lot more care than most online bogus polls to clamp down on virtual ballot-box stuffing. Its results are more trustworthy than the typical online clicky poll.  You should definitely be more confident that it’s identified the truly most popular bird in Aotearoa than you are that the average unweighted opt-in survey is telling you the truth.
September 23, 2025

Panadol scare

R.F. Kennedy Jr managed to predict almost perfectly the day on which his research initiative would “find”  “the” “cause” of autism.  Of course, it’s easier when you don’t have to actually do any new research.

What do we actually know about paracetamol and autism or ADHD?

About a decade ago, there was a surprise finding of a fairly weak but not negligible correlation between paracetamol use during pregnancy and ADHD symptoms in the infant.  A New Zealand study repeated this analysis and found the same answer, at which point it became a bit more interesting.  There have been other replications since then.  The correlation is reasonably well established. The problem is deciding what we can say about causation.

Clearly no-one has done a randomised trial where some pregnant people take paracetamol and others don’t, because that would be unethical and also no-one would volunteer to be in the trial.  In the absence of randomisation, the question is how comparable the paracetamol and non-paracetamol infants would be otherwise. ADHD and autism diagnosis varies in frequency by all sorts of social factors, and there’s good evidence for a genetic basis in at least some cases of autism, so comparability is not automatic.  Also, one thing we do know about all the pregnant people who took paracetamol is that they had a reason to take paracetamol (probably pain or fever).  In contrast to alcohol or tobacco,  no-one’s taking paracetamol just for fun.

So, at that point things were all a bit unclear. On the one hand, maybe you’d want to avoid paracetamol during pregnancy if you didn’t need it; on the other hand, you probably already were.

Last year, a very large study in Sweden reported its results. They also found a weak correlation between paracetamol use and ADHD and autism symptoms in the whole population. However, they went further than this.  They did a study restricted to comparisons between siblings.  Oversimplifying massively, you could imagine taking all the families with two children where paracetamol was used in pregnancy for just one child and not the other. You could then count up the number of families where the paracetamol-exposed infant had ADHD or autism and not the unexposed child, and vice versa.  The point is that any other factor that differs between families will be the same for the two kids in the comparison and so can’t cause a correlation. This could be a genetic factor, or some ethnic or social class difference, or access to health care, or many other things.  (My description was oversimplified in the sense that they didn’t just use families with two kids, but also those with more than two, and they adjusted for variables that they know about and are different within a family.)

Importantly, this isn’t just a case of preferring a newer study or a bigger study.  The fact that the Swedish study saw broadly the same whole-population correlations as other research studies argues that there isn’t something different about Sweden or about their data collection. The fact that they didn’t see the same correlation when doing within-family comparisons argues that the correlation is caused by something that varies between families, not something about individual pregnancies such as paracetamol use.
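
The oversimplified two-child version of the sibling comparison described above comes down to counting two kinds of families. Here is a minimal sketch with made-up data; it is not the Swedish study's analysis, which used larger families and adjusted for within-family variables, and the function name and the records are hypothetical.

```python
# Made-up families where exactly one of two children was exposed in pregnancy.
# Each record is (exposed child diagnosed, unexposed child diagnosed).
families = [
    (True, False),
    (False, True),
    (False, False),
    (True, True),
    (True, False),
    (False, True),
    (False, False),
]


def discordant_counts(pairs):
    """Count families where only the exposed, or only the unexposed, child is diagnosed."""
    exposed_only = sum(1 for exposed, unexposed in pairs if exposed and not unexposed)
    unexposed_only = sum(1 for exposed, unexposed in pairs if unexposed and not exposed)
    return exposed_only, unexposed_only


exposed_only, unexposed_only = discordant_counts(families)
print("only exposed child diagnosed:  ", exposed_only)
print("only unexposed child diagnosed:", unexposed_only)
```

Families where both children or neither child has a diagnosis tell you nothing about exposure, so only the two discordant counts matter. If they are about equal, there is no sign of a within-family association; their ratio is the usual matched-pair estimate of the odds ratio.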

Estimating rare proportions

There is a statistic circulating on social media claiming that the average person in the USA thinks 21% of the population is transgender.  Obviously this isn’t true (both obviously it isn’t 21% and obviously that isn’t what the average person believes). It’s similar in some ways to the claim that some Americans think Iran is in the middle of the Atlantic Ocean, which I’ve dealt with before, except that estimating small proportions is an extensively studied problem in psychology, so a lot is known about the biases. In fact, if you look at the original source for the claim, demonstrating this phenomenon was the actual point of the story.

As Danielle Navarro explains, all small proportions are overestimated and all large proportions underestimated when people aren’t certain of the true value. This is an extremely consistent phenomenon, to the extent that we can actually say Americans are better informed about the proportion of transgender people than they are about other comparably extreme proportions.
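
One simple way to picture this pattern is to suppose that an uncertain guess gets pulled towards 50% on the log-odds scale. That model is an assumption made here for illustration, not necessarily the one in the linked explanation, and the `weight` parameter and its value of 0.5 are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def expit(x):
    """Inverse of logit: turn log-odds back into a proportion."""
    return 1 / (1 + math.exp(-x))


def guessed_proportion(true_p, weight=0.5):
    """Shrink the true proportion towards 50% on the log-odds scale.

    weight=1 would be a perfectly informed guess; weight=0 always guesses 50%.
    The default of 0.5 is an arbitrary illustrative choice.
    """
    return expit(weight * logit(true_p))


for true_p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"true {true_p:.2f} -> guessed {guessed_proportion(true_p):.2f}")
```

Under this sketch every proportion below 50% is guessed too high and every proportion above 50% too low, with the largest relative errors at the extremes, which is the shape of the bias described above.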

[Update: Andrew Gelman writes about a slightly different, but related phenomenon, in the context of people reporting having been present for mass shootings.  It’s slightly different because people are reporting their own experience, which they presumptively do know, rather than their estimates of some proportion they have no way of knowing. We’d expect the bias to be smaller in this setting, but to still be present — it’s like the estimate of the frequency of virgin birth from the National Longitudinal Study of Youth]