Posts from January 2018 (14)

January 8, 2018

Long tail of baby names

The Dept of Internal Affairs has released the most common baby names of 2017 (NZ is, I think, the first country each year to do this), and Radio NZ has a story. A lot of the names popular last year were also popular in the past; a few (e.g. Arlo) are changing fast.

If you look at the sixty-odd years of data available, there’s a dramatic trend. In 1954, ‘John’ was the top boy’s name, with 1389 uses. In 2017 the top was ‘Oliver’, but with only 314 uses — not enough to make 1954’s top twenty. According to the government, there were nearly 13,000 different names given last year, so the mean number of babies per name is under 5; the most popular names are still much more popular than average. But less so than in the past.
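The arithmetic behind those comparisons is simple enough to check. Here is a minimal sketch in Python using only the figures quoted above; rounding "nearly 13,000" up to 13,000 and "under 5" up to 5 is my simplification, so the results are bounds rather than exact values.

```python
# Figures quoted in the post
top_1954 = 1389        # babies named 'John' in 1954
top_2017 = 314         # babies named 'Oliver' in 2017

# Assumption: round the quoted figures up, so we get bounds, not exact values
names_2017 = 13_000    # "nearly 13,000" different names
mean_per_name = 5      # "under 5" babies per name

# Fewer than 13,000 names at under 5 babies each implies fewer births than this
implied_max_births = names_2017 * mean_per_name

# The top name is used at least this many times as often as the mean name
top_vs_mean = top_2017 / mean_per_name

# Proportional fall in the count for the top name between 1954 and 2017
decline = 1 - top_2017 / top_1954

print(f"implied births: fewer than {implied_max_births}")
print(f"top name vs mean name: more than {top_vs_mean:.1f} times")
print(f"fall in top-name count since 1954: {decline:.0%}")
```

So 'Oliver' is still more than sixty times as common as the mean name, even though the count for the top name has fallen by about three quarters.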

Here’s the trend in the number of babies given the top name

and the top ten names

and the top hundred names

That decrease is despite an increase in the total population: here’s the top 10 names as a percentage of all babies (assuming 53% of babies are boys)

and the top 100 names

The proportion with any of the top 100 names has been going down consistently, and the gap between boys and girls has been narrowing.

Not dropping every year

Stuff has a story on road deaths, where Julie Anne Genter claims the Roads of National Significance are partly responsible for the increase in death rates. Unsurprisingly, Judith Collins disagrees. The story goes on to say (it’s not clear if this is supposed to be indirect quotation from Judith Collins):

From a purely statistical viewpoint the road toll is lowering – for every 10,000 cars on the road, the number of deaths is dropping every year.

From a purely statistical viewpoint, this doesn’t seem to be true. The Ministry of Transport provides tables that show a rate of fatalities per 10,000 registered vehicles of 0.077 in 2013, 0.086 in 2014, 0.091 in 2015, and 0.090 in 2016. Here’s a graph, first raw

and now with a fitted trend (on a log scale, since the trend is straighter that way)
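For anyone who wants to try the log-scale idea, here is a minimal sketch in Python: an ordinary least-squares line through the logarithms of the four rates quoted above. It is not the fit shown in the graph, which may well use a longer series and a different method, and four points are far too few to say anything about random variation. The helper name `log_linear_trend` is mine.

```python
import math

# Fatalities per 10,000 registered vehicles, as quoted in the text
rates = {2013: 0.077, 2014: 0.086, 2015: 0.091, 2016: 0.090}

def log_linear_trend(series):
    """Least-squares fit of log(rate) = intercept + slope * year."""
    years = list(series.keys())
    logs = [math.log(v) for v in series.values()]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = log_linear_trend(rates)

# A slope on the log scale is a constant proportional change per year
annual_change = math.exp(slope) - 1

# The literal claim in the story: did the rate drop every year?
values = list(rates.values())
dropped_every_year = all(later < earlier
                         for earlier, later in zip(values, values[1:]))

print(f"fitted change per year: {annual_change:+.1%}")
print(f"rate dropped every year: {dropped_every_year}")
```

On these four figures the fitted slope is positive, and the rate fell in only one of the three year-to-year comparisons.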

Now, it’s possible there’s some other way of defining the rate that doesn’t show it going up each year. And there’s a question of random variation as always. But if you scale for vehicles actually on the road, by using total distance travelled, we saw last year that there’s pretty convincing evidence of an increase in the underlying rate, over and above random variation.

The story goes on to say “But Genter is not buying into the statistics.” If she’s planning to make the roads safer, I hope that isn’t true.

Briefly

  • “Every now and then a story appears in the media about how boffins (and it is always “boffins”) have worked out an equation for something: the perfect cup of tea, the most depressing day of the year, the best way to make pancakes, the perfect handshake, or in the most recent case, the perfect cheese on toast.” The equation for the perfect bullshit equation.
  • The BBC’s statistics-in-the-media radio program More or Less has a special ‘statistics of the year’ episode.
  • Some interesting student projects from a data visualisation class.
  • How Spotify picks your music.
  • “Average London”: averages of tourist photos of the same London attraction.
  • Displaying uncertainty in the UK unemployment rate.
  • One of the problems in training modern neural network classifiers is that they will pick up on anything, sensible or not. Luke Oakden-Rayner writes about a popular set of data from chest x-rays and why it won’t teach the computers the right things.
  • The American Academy of Family Physicians is not endorsing new blood pressure standards that would increase the proportion of US adults defined as having hypertension from about 1/3 to about 1/2.

January 2, 2018

Consider a spherical cow

Part of the point of mathematical modelling is discarding unimportant features of a problem to make it tractable. But you have to discard the right features. Here are two recent stories about mathematical optimisation.

Jason Steffen has invented a more efficient way of getting passengers on to planes — not just more efficient than what US airlines actually do, but even more efficient than letting passengers board at random. He writes

So, why isn’t this optimum method of airplane boarding being adopted by any carrier in the industry? One significant reason may be the challenge of its implementation — lining passengers up in such a rigid order. 

I’d argue that a much more important reason is that you’d have to get rid of priority boarding, making frequent flyers queue with everyone else and depriving them of their chance to get the lion’s share of overhead luggage space. A model that doesn’t account for the power of frequent flyers is solving the wrong optimisation problem if the goal is to get implemented.

The same sort of issue often turns up in US discussions of partisan gerrymandering, where you’ll see mathematicians write about algorithms for perfect electorate design. These don’t solve an existing problem, because they don’t take into account who actually draws districts: it isn’t impartial mathematicians.  The main theoretical limitation on gerrymandering in the US is the power of courts to declare a partisan redistricting plan unconstitutional — but they aren’t willing to do so.  Justice Scalia wrote in 2004

    Eighteen years of judicial effort with virtually nothing to show for it justify us in revisiting the question whether the standard promised by Bandemer exists. As the following discussion reveals, no judicially discernible and manageable standards for adjudicating political gerrymandering claims have emerged. Lacking them, we must conclude that political gerrymandering claims are nonjusticiable and that Bandemer was wrongly decided.

There’s a new effort to change this, from Wisconsin. In 2015, the state was sued in federal District Court over its redistricting plan, and lost. The case focused on the ‘efficiency gap’: the difference in the number of ‘wasted’ votes between the two parties, as a percentage of all votes cast. The Supreme Court heard an appeal in October 2017 and is still thinking about it.
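To make the definition concrete, here is a minimal sketch in Python. It assumes two parties and the usual convention that a vote is ‘wasted’ if it goes to the losing candidate, or to the winner beyond the half of the district’s votes needed to win. The five districts are invented for illustration and are not Wisconsin data; the function name `efficiency_gap` is mine.

```python
def efficiency_gap(districts):
    """Efficiency gap from a list of (votes_a, votes_b) district results.

    Returns (wasted_a - wasted_b) / total votes, so a positive value means
    party A wasted more votes, i.e. the map favours party B.
    Assumption: the winning threshold is taken as exactly half the votes cast.
    """
    wasted_a = wasted_b = total = 0.0
    for votes_a, votes_b in districts:
        if votes_a == votes_b:
            raise ValueError("tied districts are not handled in this sketch")
        district_total = votes_a + votes_b
        needed = district_total / 2
        if votes_a > votes_b:
            wasted_a += votes_a - needed   # winner's surplus votes
            wasted_b += votes_b            # all of the loser's votes
        else:
            wasted_b += votes_b - needed
            wasted_a += votes_a
        total += district_total
    return (wasted_a - wasted_b) / total

# Hypothetical example: five districts of 100 voters each
example = [(75, 25), (60, 40), (43, 57), (48, 52), (49, 51)]
gap = efficiency_gap(example)
print(f"efficiency gap: {gap:.0%} of all votes cast")
```

In this made-up example party A gets 55% of the votes but wins only two of the five seats: it wastes 175 votes to party B’s 75, a gap of 20% of the 500 votes cast.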

Patrick Honner wrote about the efficiency-gap proposal for Quanta, but there’s a lot more detail in a 2015 expert-witness report by Simon Jackman (PDF), an Australian political scientist at Stanford.