Posts filed under Denominator? (83)

September 19, 2017

Denominators and BIGNUMs


It’s pretty obvious that Bon Appétit has just confused averages and totals here.

So, what is the average? There were about 75 million millennials in the US in 2016 (we can probably assume Bon Appétit doesn’t care about other countries), so we’re looking at $1280/year, or about $25/week. Which actually seems pretty low as an average. The US as a whole spent $1.46 trillion on food and beverages in 2014, which is about $4500/person/year or about $87/week.
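As a quick check of the arithmetic, here is a minimal Python sketch using only the figures quoted above. The "implied total" is derived from the per-person figure and the population, not quoted from the story, and the variable names are mine.

```python
# Figures quoted in the post.
millennials = 75e6             # US millennials in 2016
per_millennial_year = 1280     # dollars per millennial per year
us_per_person_year = 4500      # dollars per person per year, from the 2014 US total
weeks_per_year = 52

# Total implied by the per-person figure (derived here, not quoted in the post).
implied_total = millennials * per_millennial_year

# Convert both annual averages to weekly averages.
millennial_per_week = per_millennial_year / weeks_per_year
us_per_week = us_per_person_year / weeks_per_year

print(f"implied millennial total: ${implied_total / 1e9:.0f} billion/year")
print(f"millennials: ${millennial_per_week:.0f}/week")
print(f"US overall:  ${us_per_week:.0f}/week")
```

The big number only looks big until it is divided by the number of people it is spread over.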

As with so much generation-mongering, asking about the facts is missing the intended purpose of the story, which is to recycle some stereotypes about lazy/wasteful youth.

The story links to another, about a new book “Generation Yum”

Turow characterized the quintessential Millennial experience this way: “You got into a top tier high school, you hustled through college—you’ve done everything society told you—and you’re not rewarded.”

When “get into a top-tier high school” is a quintessential generational experience it’s clear we’re not even trying to go beyond unrepresentative stereotypes.  In which case, hold the numbers.

August 19, 2017

Sampling bias

Via GeoNet, a magnitude 4.5 quake south of Dannevirke (blue box)


The squares are reports of shaking. The big cluster is Palmerston North, with secondary clusters in Feilding and Ashhurst: there are more people who felt the quake there because there are more people there.  See also XKCD

August 11, 2017

Different sorts of graphs

This bar chart from Figure.NZ was in Stuff today, with the lead

Working-age people receiving benefits are mostly in the prime of our working life – the ages of 25 to 54.


The numbers are correct, but the extent to which the graph fits the story is a bit misleading. The main reason the two bars in the middle are higher is that they are 15-year age groups, whereas the first bar is a 7-year group and the last is a 10-year group.

Another way to show the data is to scale the bar widths proportional to the number of years and then scale the height so that the bar area matches the count of people. The bar height is now the count of people per year of age.


This is harder to read for people who aren’t used to it, but arguably more informative. It suggests the 25-54 year groups may be the largest just because the groups are wider.
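The rescaling is just a division by group width. Here is a minimal Python sketch; the age-group boundaries are the ones in the chart, but the counts are HYPOTHETICAL numbers chosen for illustration, not the Figure.NZ data, and the function name is mine.

```python
# Age groups from the chart: (label, first age, last age), ages inclusive.
groups = [("18-24", 18, 24), ("25-39", 25, 39), ("40-54", 40, 54), ("55-64", 55, 64)]

# HYPOTHETICAL counts, for illustration only; not the Figure.NZ data.
counts = {"18-24": 42_000, "25-39": 90_000, "40-54": 90_000, "55-64": 70_000}


def per_year_of_age(groups, counts):
    """Return {label: (width_in_years, people_per_year_of_age)}."""
    result = {}
    for label, first, last in groups:
        width = last - first + 1   # 18-24 covers 7 single years of age
        result[label] = (width, counts[label] / width)
    return result


densities = per_year_of_age(groups, counts)
for label, (width, height) in densities.items():
    # width * height recovers the original count, so bar area equals the count
    print(f"{label}: width {width} years, height {height:.0f} people per year of age")
```

With these made-up counts the two middle groups are the tallest bars in a plain bar chart, but after dividing by width they are no taller than the 18-24 group.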

We really need population size data, since the number of people in NZ also varies by age group. Showing the percentage receiving benefits in each age group gives a different picture again:


It looks as though

  • “working age” people 25-39 and 40-54 make up a larger fraction of those receiving benefits than people 18-24 or 55-64
  • a person receiving benefits is more likely to be, say, 20 or 60 than 35 or 45.
  • the proportion of people receiving benefits increases with age

These can all be true; they’re subtly different questions. Part of the job of a statistician is to help you think about which one you wanted to ask.

August 8, 2017

Breast cancer alcohol twitter

Twitter is not an ideal format for science communication, because of the 140-character limit: it’s easy to inadvertently leave something out. Here’s one I was referred to this morning (link, so you can see if it is retracted)


Usually I’d think it was a bit unfair to go after this sort of thing on StatsChat.  The reason I’m making an exception here is the hashtag: this is a political statement by a person of mana.

There’s one gross inaccuracy (which I missed on first reading) and one sub-optimal presentation of risk.  To start off, though, there’s nothing wrong with the underlying number: unlike many of its ilk it isn’t an extrapolation from high levels of drinking and it isn’t obviously confounded, because moderate drinkers are otherwise in better health than non-drinkers on average.  The underlying number is that for each standard drink per day, the rate of breast cancer increases by a factor of about 1.1.

The gross inaccuracy is the lack of a per day qualifier, making the statement inaccurate by a factor of several thousand.  An average of one standard drink per day is not a huge amount, but it’s probably more than the average for women in NZ (given the  2007/08 New Zealand Alcohol and Drug Use Survey finding that about half of women drank alcohol less than weekly).

Relative rates are what the research produces, but people tend to think in absolute risks, despite the explicit “relative risk” in the tweet. The rate of breast cancer in middle age (what the data are about) is fairly low. The lifetime risk for a 45-year-old woman (if you don’t die of anything else before age 90) is about 12%. A 10% increase in that is 13.2%, not 22%. It would take about 7 drinks per day to roughly double your risk (1.1^7 ≈ 1.95), and you’d have other problems as well as breast cancer risk.
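The relative-versus-absolute distinction can be sketched in a few lines of Python. The 12% baseline and the factor of 1.1 per daily drink are the figures quoted above; applying the rate ratio directly to lifetime risk is a simplifying assumption, and the function and variable names are mine.

```python
baseline_lifetime_risk = 0.12   # about 12%, as quoted in the post
rate_ratio_per_drink = 1.1      # per standard drink per day, as quoted


def risk_with_drinks(drinks_per_day):
    """Approximate lifetime risk, assuming the rate ratio multiplies the risk."""
    return baseline_lifetime_risk * rate_ratio_per_drink ** drinks_per_day


# A relative increase of 10% multiplies the baseline...
one_drink = risk_with_drinks(1)
# ...whereas the misreading adds 10 percentage points to it.
misreading = baseline_lifetime_risk + 0.10

# Seven drinks per day compound to roughly a doubling.
seven_drinks_ratio = rate_ratio_per_drink ** 7

print(f"one drink per day: {one_drink:.1%} (misreading gives {misreading:.0%})")
print(f"seven drinks per day multiply the rate by {seven_drinks_ratio:.2f}")
```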


July 29, 2017

Anything goes

According to a story in the Herald, based on what looks like it might be a bogus poll (press release), you need $5.3 million in Australia now to be considered rich.  If we assumed the number did actually measure something, how surprising would it be?

Before “Who wants to be a millionaire?” was a quiz show franchise, it was a Cole Porter song, from the 1956 movie “High Society”, so that seems a reasonable comparison period. The Australian CPI has gone up by a factor of 15.6 since 1956 (and while Australia didn’t have dollars until 1966, US and Australian dollars were roughly comparable then).

On top of pure currency conversion, though, Australia is richer now than in 1956.  Australia’s GDP in current purchasing-power adjusted dollars is nearly 8 times what it was in 1956. The population has gone from 9.4 million to 24.1 million, so real GDP per capita is up by a factor of about 3.5.

So, a 1956 million would be 15.6 current millions just from inflation, and over $50 million as a share of Australia’s economy: a millionaire in those days was not just rich, but Big Rich — as the song says: “flashy flunkies everywhere… a gigantic yacht… liveried chauffeur.”
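One reading of that arithmetic, as a minimal Python sketch: scale a 1956 million by the CPI factor, then by the growth in real GDP per capita. Both factors are the ones stated in the post; that the "over $50 million" figure comes from multiplying them is my assumption, and the variable names are mine.

```python
cpi_factor = 15.6                   # Australian CPI growth since 1956, per the post
real_gdp_per_capita_factor = 3.5    # real GDP per capita growth, per the post

million_1956 = 1.0                  # in millions of 1956 dollars

# Adjust for prices only.
inflation_only = million_1956 * cpi_factor

# ASSUMPTION: "as a share of the economy" means also scaling by per-capita growth.
economy_share = inflation_only * real_gdp_per_capita_factor

print(f"inflation only:       ${inflation_only:.1f} million")
print(f"share of the economy: ${economy_share:.1f} million")
```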

We’re not given any real reason to believe the $5.3 million figure — there’s no reason you should rely on it more than your own guess. And ‘millionaire’ isn’t a useful comparison without a lot of additional qualification.

January 11, 2017

If you’re a house

From the Herald

Nationwide 63.2 per cent of people today live in their own home – the lowest rate since the 61.2 per cent recorded at the 1951 Census – whereas 33 per cent live in a rental.

From Newstalk ZB

A shade over 63 percent of people today are living in their own home. 

That’s the lowest rate since 1951 when it was 61 percent.

From Newshub

Dwelling and household estimates data released on Tuesday shows that as of December 2016, 63.2 percent of people live in their own home.

One News don’t have text up yet, but their story has the same claim.

As David Welch points out in a stat-of-the-week nomination, that’s not what the number means: 63.2% is the percentage of homes occupied by at least one of their owners.  It’s the home ownership rate if you’re a house, rather than if you’re a person.
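To see how the per-home and per-person rates can differ, here is a minimal Python sketch. The households are HYPOTHETICAL, invented purely for illustration; they are not StatsNZ data.

```python
# HYPOTHETICAL households: (number of residents, occupied by at least one owner?)
households = [
    (1, False), (1, False), (2, True), (4, True),
    (5, True), (6, False), (3, True), (2, True),
]

# Denominator = homes: the figure the news stories actually had.
homes_owner_occupied = sum(1 for _, owned in households if owned)
rate_per_home = homes_owner_occupied / len(households)

# Denominator = people: the figure the news stories claimed to report.
people_total = sum(n for n, _ in households)
people_in_owner_occupied = sum(n for n, owned in households if owned)
rate_per_person = people_in_owner_occupied / people_total

print(f"homes occupied by an owner:         {rate_per_home:.1%}")
print(f"people living in an owner's home:   {rate_per_person:.1%}")
```

Here the per-person rate happens to come out higher, but swapping a few household sizes would push it lower; the two numbers answer different questions.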

The proportion of people living in those households isn’t easy to work out — on one hand, single-person households tend to be renters; on the other hand, overcrowded households are often renters too.  StatsNZ does provide the proportion of individuals who own their home, which is rather lower, at about 50%. But that’s not the number the news stories want, either.  That’s the proportion of people 20 and older who, personally, own or part-own their homes. Living in a home owned by your parents, or your partner, or your child, doesn’t count.

That last sentence also illustrates why ‘home ownership’ is harder to define than you might think, just like unemployment. Should a 22-year-old living with parents count towards home ownership? If not, should they count in the denominator as non-owners, or should we just be looking at owning vs renting? How about an elderly person living with one of their children?

It would be helpful if the proportion of people living in owner-occupied households was published regularly, but it wouldn’t answer all the questions.  As an easier step, it would also be useful if the media accurately described the number they used.

December 2, 2016

Crash statistics

From the Herald

Obviously there isn’t research giving ‘the exact time you will crash your car’.  What you might hope for is the time at which you (more precisely, the average NZ driver) are at highest risk.  We don’t even get that.

The comparisons are for totals, and as the story admits, more crashes happen in peak times because more people are driving.  It’s worse than that, though. The story says

…22,000 collisions occur annually in the afternoon peak up to 6pm. This then drops to just 2000 crashes a year at 11pm and a mere 800 at 1am.

The 22,000 is over 3-hour periods and I think the 2000 and 800 are for single-hour periods — I can’t tell for sure, because there’s no link to the original source, and I can’t find it on the IAG website.
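If that guess about the time periods is right, the comparison changes quite a lot once everything is put on a per-hour basis. A minimal Python sketch, using the three figures quoted in the story and ASSUMING a 3-hour peak and single-hour late-night periods:

```python
peak_total = 22_000       # collisions per year in the afternoon peak, as quoted
peak_hours = 3            # ASSUMED length of the peak period, in hours
late_night = {"11pm": 2_000, "1am": 800}   # ASSUMED to be single-hour totals

# Put the peak figure on the same per-hour basis as the late-night figures.
peak_per_hour = peak_total / peak_hours

for label, crashes in late_night.items():
    raw_ratio = peak_total / crashes
    per_hour_ratio = peak_per_hour / crashes
    print(f"peak vs {label}: {raw_ratio:.1f}x as quoted, {per_hour_ratio:.1f}x per hour")
```

Even the per-hour ratios are still totals rather than risks, since they don't account for how many people are driving at each time.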

Perhaps more relevantly for the New Zealand Herald, you have to read down to paragraph 11, which begins “Across most states…” to get the first solid indication that this story is about another country.

It’s from, which explains why the handling of numbers isn’t up to local standards.

November 26, 2016

Garbage numbers from a high-level source

The World Economic Forum (the people who run the Davos meetings) are circulating this graph:

According to the graph, New Zealand is at the bottom of the OECD, with 0% waste composted or recycled.  We’ve seen this graph before, with a different colour scheme. The figure for NZ is, of course, utterly bogus.

The only figure the OECD report had on New Zealand was for landfill waste, so obviously landfill waste was 100% of that figure, and other sources were 0%.   If that’s the data you have available, NZ should just be left out of the graph — and one might have hoped the World Economic Forum had enough basic cluefulness to do so.
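The difference between "missing" and "zero" is easy to show in code. This is a minimal Python sketch with HYPOTHETICAL tonnages; the function name and numbers are mine, not the OECD's.

```python
def share_recycled(landfill, recycled):
    """Share of waste recycled or composted, or None when it can't be computed."""
    if landfill is None or recycled is None:
        return None   # missing data: leave the country off the graph
    return recycled / (landfill + recycled)


# HYPOTHETICAL tonnages, for illustration only.
complete = share_recycled(landfill=650, recycled=350)

# Only a landfill figure is available: the honest answer is "unknown".
missing = share_recycled(landfill=650, recycled=None)

# Treating the missing figure as zero produces the bogus 0%.
missing_as_zero = share_recycled(landfill=650, recycled=0)

print(complete, missing, missing_as_zero)
```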

A more interesting question is what the denominator should be. The definition the OECD was going for was all waste sent for disposal from homes and from small businesses that used the same disposal systems as homes. That’s a reasonable compromise, but it’s not ideal. For example, it excludes composting at home. It also counts reuse and reduced use of recyclable or compostable materials as bad rather than good.

But if we’re trying to approximate the OECD definition, roughly where should NZ be? I can’t find figures for the whole country, but there’s some relevant (if outdated) information in Chapter 3 of the Waste Assessment for the Auckland Council Waste Management Plan. If you count just kerbside recycling pickup as a fraction of kerbside recycling+waste pickup, the diversion figure is 35%. That doesn’t count composting, and it’s from 2007-8, so it’s an underestimate. Based on this, NZ is probably between USA and Australia on the graph.

June 13, 2016

Reasonable grounds

Mark Hanna submitted an OIA request about strip searches in NZ prisons, which are carried out with ‘reasonable grounds to believe’ the prisoner has an unauthorised item. You can see the full response at FYI. He commented that 99.3% of these searches find nothing.

Here’s the monthly data over time:

The positive predictive value of having ‘reasonable grounds’  is increasing, and is up to about 1.5% now. That’s still pretty low. How ‘reasonable’ it is depends on what proportion of the time people who aren’t searched have unauthorised items: if that were, say, 1 in 1000, having ‘reasonable grounds’ would be increasing it 5-15-fold, which might conceivably count as reasonable.
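The fold-increase is a ratio of two proportions. A minimal Python sketch: the 99.3% and 1.5% figures are from the post, the 1-in-1000 base rate is the post's "say" figure (purely illustrative, not data), and the variable names are mine.

```python
overall_hit_rate = 1 - 0.993    # 99.3% of searches find nothing, overall
recent_hit_rate = 0.015         # about 1.5% in the most recent data
assumed_base_rate = 1 / 1000    # ILLUSTRATIVE rate among prisoners not searched

# How much more often a search finds something than the assumed base rate.
fold_overall = overall_hit_rate / assumed_base_rate
fold_recent = recent_hit_rate / assumed_base_rate

print(f"overall: {fold_overall:.0f}-fold, recent: {fold_recent:.0f}-fold")
```

The answer depends entirely on the base rate, which is the number we don't have.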

We can look at the number of searches conducted, to see if that tells us anything about trends:
Again, there’s a little good news: the number of strip searches has fallen over the past couple of years. That’s a real rise and fall: the prison population has been much more stable. The trend looks very much like the first trend upside down.

Here’s the trend for number (not proportion) of searches finding something:
It’s pretty much constant over time.

Statistical models confirm what the pictures suggest: the number of successful searches is essentially uncorrelated with the total number of searches. This is also basically good news (for the future, if not the past): it suggests that a further reduction in strip searches may well be possible at no extra risk.

May 29, 2016

I’ma let you finish

Adam Feldman runs the blog Empirical SCOTUS, with analyses of data on the Supreme Court of the United States. He has a recent post (via Mother Jones) showing how often each judge was interrupted by other judges last year:


For those of you who don’t follow this in detail, Elena Kagan and Sonia Sotomayor are women.

Looking at the other end of the graph, though, shows something that hasn’t been taken into account. Clarence Thomas wasn’t interrupted at all. That’s not primarily because he’s a man; it’s primarily because he almost never says anything.

Interpreting the interruptions really needs some denominator. Fortunately, we have denominators. Adam Feldman wrote another post about them.

Here’s the number of interruptions per 1000 words, with the judges sorted in order of how much they speak


And here’s the same thing with interruptions per 100 ‘utterances’


It’s still pretty clear that the female judges are interrupted more often (yes, this is statistically significant, though not very). Taking the amount of speech into account makes the differences smaller, but, interestingly, also shows that Ruth Bader Ginsburg is interrupted relatively often.
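The normalisation itself is simple. Here is a minimal Python sketch with three HYPOTHETICAL speakers; the counts are invented for illustration and are not from Adam Feldman's data.

```python
# HYPOTHETICAL speakers: (label, times interrupted, words spoken)
speakers = [
    ("Judge A", 30, 20_000),
    ("Judge B", 30, 60_000),
    ("Judge C", 0, 500),
]


def per_1000_words(interruptions, words):
    """Interruptions per 1000 words spoken."""
    return 1000 * interruptions / words


rates = {name: per_1000_words(n, words) for name, n, words in speakers}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} interruptions per 1000 words")
```

Judges A and B have the same raw count but a threefold difference in rate, and Judge C's zero says more about the 500 words than about the interruptions.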

Denominators do matter.