Stats Chat

April 10, 2015

Briefly

By Thomas Lumley

A properly-conducted opinion poll in Cuba, done in secret. Impressive.

As the Herald reports, New Zealand moved from 1st to 5th on the index reported by Social Progress Imperative. The story also points out, helpfully, that a lot of this is due to changes in how things are measured. It turns out this goes further: a 2014 version of the index is available using the new measurements. When the same definitions are used for the two years, NZ stays at the same ranking (5th) and improves its actual score (from 86.93 to 87.08).

JPMorgan is using workplace data to predict which employees are likely to ‘go rogue’. Matt Levine doesn’t really worry. The Bloomberg News story worries a bit, but only this far: “Policing intentions can be a slippery slope. Do people get a scarlet letter for something they have yet to do?” They don’t seem to consider false positives: people who weren’t going to do anything wrong (or more wrong than is necessary if you work for an investment bank).
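To see why false positives matter, here is a minimal sketch of the base-rate arithmetic. The prevalence, sensitivity and specificity below are invented purely for illustration; nothing in the stories gives real figures for JPMorgan's system.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of flagged employees who really would have 'gone rogue'."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical inputs (assumptions, not taken from any report):
# 1% of employees would go rogue, the model catches 90% of them,
# and it correctly clears 95% of everyone else.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.90, specificity=0.95)
print(f"{ppv:.1%} of flagged employees are true positives")
```

With these made-up inputs, most of the people flagged were never going to do anything wrong, even though the model sounds accurate.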

The NZ Association of Scientists is having a conference titled “Speaking Out: Going public on difficult issues”. There will probably be more stuff online soon, but currently you can read an expanded version of Peter Gluckman’s talk, and listen to (NZAS President) Nicola Gaston on Radio NZ; the Twitter hashtag is #GoingPublic.

Odds and probabilities

By Thomas Lumley

When quoting results of medical research there’s often confusion between odds and probabilities, but there are stories in the Herald and Stuff at the moment that illustrate the difference.

As you know (unless you’ve been on Mars with your eyes shut and your fingers in your ears), Jeremy Clarkson will no longer be presenting Top Gear, and the world is waiting with bated breath to hear about his successor. Coral, a British firm of bookmakers, say that Sue Perkins is the current favourite.

The Herald quotes the Daily Mail, and so gives the odds as odds:

It has made her evens for the role, ahead of former X-factor presenter Dermot O’Leary who is 2-1 and British model Jodie Kidd who is third at 5-2.

Stuff translates these into NZ gambling terms, quoting the dividend, which is the reciprocal of the probability at which these would be regarded as fair bets:

Bookmaker Coral have Perkins as the equivalent of a $2 favourite after a flurry of bets, while British-Irish presenter Dermot O’Leary was at $3 and television personality and fashion model Jodie Kidd at $3.50.

Odds of 5-2 mean that betting £2 and winning gives you a profit of £5. The NZ approach is to quote the total money you get back: a bet of $2 gets you $2 back plus $5 profit, for a total of $7, so a bet of $1 would get you $3.50.

The fair probability of winning at odds of 5-2 is 2/(5+2) = 2/7; the fair probability for a dividend of $3.50 is 1/3.50, the same number (about 0.29).
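The conversions above can be sketched in a few lines of Python. The function names are mine, and "evens" is written as 1-1; exact fractions are used so the two routes to the probability can be compared without rounding.

```python
from fractions import Fraction

def dividend(win, stake):
    """Total returned per $1 bet for fractional odds quoted as win-stake."""
    return Fraction(win + stake, stake)

def fair_probability(win, stake):
    """Probability at which odds of win-stake would be a fair bet."""
    return Fraction(stake, win + stake)

# Odds quoted in the stories; evens is 1-1
odds = {"Sue Perkins": (1, 1), "Dermot O'Leary": (2, 1), "Jodie Kidd": (5, 2)}

for name, (win, stake) in odds.items():
    d = dividend(win, stake)
    p = fair_probability(win, stake)
    print(f"{name}: {win}-{stake} -> ${float(d):.2f}, fair probability {float(p):.3f}")
```

The fair probability is always the reciprocal of the dividend, which is why the two newspapers' numbers describe the same bets.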

Of course, if these were fair bets the bookies would go out of business: the actual implied probability for Jodie Kidd is lower than 1/3.5 and the actual implied probability for Sue Perkins is lower than 0.5. On top of that, there is no guarantee the betting public is well calibrated on this issue.
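One way to see the bookmaker's margin is to add up the naive probabilities 1/dividend. This sketch uses only the three quoted candidates; the proportional rescaling at the end is a common simplification and an assumption on my part, not a description of how Coral actually sets prices.

```python
# Dividends quoted by Stuff for the three leading candidates
dividends = {"Sue Perkins": 2.00, "Dermot O'Leary": 3.00, "Jodie Kidd": 3.50}

# Naive probabilities: what the prices would imply if the bets were fair
naive = {name: 1 / d for name, d in dividends.items()}
total = sum(naive.values())
print(f"Sum over just these three candidates: {total:.3f}")

# Assumption: the margin is spread proportionally across candidates.
# Other candidates are omitted, so these are upper bounds at best.
rescaled = {name: p / total for name, p in naive.items()}
for name, p in rescaled.items():
    print(f"{name}: at most about {p:.3f}")
```

The three naive probabilities already add up to more than 1 before any other candidate is counted, so they cannot all be honest probabilities.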

April 9, 2015

Graph of the week

By Thomas Lumley

Number of learner license tests taken in New Zealand, according to One News.

We’ll follow up to see if the future prediction part of the graph turns out to be correct.

Height and heart attack: genetic determinism is still wrong

By Thomas Lumley

From the Herald (originally from the Independent):

Short people are at a greater risk of heart attack – and there’s little they can do about it because the link is genetic.

This one is partly the fault of the researchers and partly the fault of the journalists. The press release says:

“We have shown that the association between shorter height and higher risk of coronary heart disease is a primary relationship and is not due to confounding factors such as nutrition or poor socioeconomic conditions.”

That’s partly true, and new and interesting, but (a) it’s being oversold (“the” association?) and (b) even if it were completely true, it wouldn’t imply the “there’s little they can do about it” added by the journalists.

Taking the second point first: knowing that something has a genetic component tells you absolutely nothing about how easy or hard it is to change. At a biological level hair colour and eye colour have similar degrees of genetic influence, but one of them is very easy to change and the other is more difficult and inconvenient.

Also, it’s certainly not true that height is entirely genetically determined. There is a genetic component: tall people have tall children. There is also an environmental component: most people are taller than their grandparents. Here’s a graph (source) showing how the heights of Dutch people changed over sixty years: the Dutch went from some of the shortest people in Europe to some of the tallest, and this was an environmental change, not a genetic change.

The research paper doesn’t even claim that among modern Westerners the association between height and heart attack risk is all genetic, though if you only have the press release you have to read carefully to avoid getting that impression. Even within the (fairly homogeneous) groups of people being studied, the genetic variants they used explain only about 10% of the variation in height.

What’s new in this research is that some of the relationship between height and heart attack risk is genetic. Until now, it was possible that all the association was explained by environmental factors in childhood or before birth that made people shorter and also, separately, increased their heart attack risk.

For the part of the relationship explained by genetic variation there are basically three possible sorts of explanation:

Being short has some direct biological effect on risk: for example, smaller people have smaller blood vessels, which might get blocked by smaller blood clots.
Being short subjects you to different environmental risks: for example, if shorter people had lower incomes (on average) they might have higher risk for various social and lifestyle reasons.
The genetic variants that make you shorter also have some separate effect on heart attack risk: for example, the same variant might affect growth in infancy and also affect diabetes risk in later life.

These are all interesting, and there’s a reasonable hope of being able to separate them out with more data and experiments.

The last sentence of the research paper is a good counterpoint to the media coverage:

More generally, our findings underscore the complexity underlying the inherited component of CAD.

[Disclosure: I work with one of the cohorts that is part of one of the consortia that is part of the whole Cardiogram group and I know some of the researchers — but that would be true of anyone in the field]

April 8, 2015

NRL Predictions for Round 6

By David Scott

Team Ratings for Round 6

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team	Current Rating	Rating at Season Start	Difference
Rabbitohs	11.30	13.06	-1.80
Roosters	9.06	9.09	-0.00
Cowboys	6.51	9.52	-3.00
Storm	5.32	4.36	1.00
Broncos	4.51	4.03	0.50
Panthers	2.88	3.69	-0.80
Bulldogs	1.52	0.21	1.30
Warriors	1.47	3.07	-1.60
Knights	0.29	-0.28	0.60
Dragons	-1.66	-1.74	0.10
Sea Eagles	-2.21	2.68	-4.90
Eels	-5.31	-7.19	1.90
Raiders	-6.34	-7.09	0.70
Wests Tigers	-7.54	-13.13	5.60
Sharks	-8.87	-10.76	1.90
Titans	-9.60	-8.20	-1.40

Performance So Far

So far there have been 40 matches played, 22 of which were correctly predicted, a success rate of 55%.

Here are the predictions for last week’s games.

	Game	Date	Score	Prediction	Correct
1	Bulldogs vs. Rabbitohs	Apr 03	17 – 18	-7.80	TRUE
2	Titans vs. Broncos	Apr 03	16 – 26	-11.30	TRUE
3	Knights vs. Dragons	Apr 04	0 – 13	7.80	FALSE
4	Sea Eagles vs. Raiders	Apr 04	16 – 29	10.30	FALSE
5	Roosters vs. Sharks	Apr 05	12 – 20	25.40	FALSE
6	Eels vs. Wests Tigers	Apr 06	6 – 22	8.60	FALSE
7	Panthers vs. Cowboys	Apr 06	10 – 30	2.40	FALSE
8	Storm vs. Warriors	Apr 06	30 – 14	6.50	TRUE
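The "Correct" column in the table above can be reproduced from the scores and predicted margins. This is only a sketch of the bookkeeping, not the rating method itself. It assumes the first-named team is the home team and that the score is listed home first, which is consistent with the table but not stated outright.

```python
def is_correct(home_score, away_score, predicted_margin):
    """True when the predicted margin has the same sign as the actual margin."""
    actual_margin = home_score - away_score
    # Assumption: draws and zero-margin predictions count as incorrect
    return actual_margin * predicted_margin > 0

# (game, home score, away score, predicted margin) from the table above
games = [
    ("Bulldogs vs. Rabbitohs", 17, 18, -7.80),
    ("Titans vs. Broncos", 16, 26, -11.30),
    ("Knights vs. Dragons", 0, 13, 7.80),
    ("Sea Eagles vs. Raiders", 16, 29, 10.30),
    ("Roosters vs. Sharks", 12, 20, 25.40),
    ("Eels vs. Wests Tigers", 6, 22, 8.60),
    ("Panthers vs. Cowboys", 10, 30, 2.40),
    ("Storm vs. Warriors", 30, 14, 6.50),
]

results = [is_correct(home, away, pred) for _, home, away, pred in games]
for (name, *_), ok in zip(games, results):
    print(f"{name}: {ok}")
print(f"Round total: {sum(results)} of {len(results)} correct")
print(f"Season so far: {22 / 40:.0%}")
```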

Predictions for Round 6

Here are the predictions for Round 6. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

	Game	Date	Winner	Prediction
1	Broncos vs. Roosters	Apr 10	Roosters	-1.50
2	Sharks vs. Knights	Apr 10	Knights	-6.20
3	Eels vs. Titans	Apr 11	Eels	7.30
4	Panthers vs. Sea Eagles	Apr 11	Panthers	8.10
5	Warriors vs. Wests Tigers	Apr 11	Warriors	13.00
6	Dragons vs. Bulldogs	Apr 12	Bulldogs	-0.20
7	Raiders vs. Storm	Apr 12	Storm	-8.70
8	Rabbitohs vs. Cowboys	Apr 13	Rabbitohs	7.80

Super 15 Predictions for Round 9

By David Scott

Team Ratings for Round 9

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team	Current Rating	Rating at Season Start	Difference
Crusaders	10.43	10.42	0.00
Waratahs	8.34	10.00	-1.70
Hurricanes	5.72	2.89	2.80
Brumbies	4.58	2.20	2.40
Chiefs	3.79	2.23	1.60
Bulls	2.49	2.88	-0.40
Stormers	2.03	1.68	0.30
Sharks	0.17	3.91	-3.70
Blues	0.09	1.44	-1.30
Highlanders	-0.23	-2.54	2.30
Lions	-3.32	-3.39	0.10
Force	-4.56	-4.67	0.10
Cheetahs	-7.14	-5.55	-1.60
Rebels	-7.20	-9.53	2.30
Reds	-8.20	-4.98	-3.20

Performance So Far

So far there have been 53 matches played, 36 of which were correctly predicted, a success rate of 67.9%.

Here are the predictions for last week’s games.

	Game	Date	Score	Prediction	Correct
1	Hurricanes vs. Stormers	Apr 03	25 – 20	8.90	TRUE
2	Rebels vs. Reds	Apr 03	23 – 15	4.30	TRUE
3	Chiefs vs. Blues	Apr 04	23 – 16	7.80	TRUE
4	Brumbies vs. Cheetahs	Apr 04	20 – 3	16.00	TRUE
5	Sharks vs. Crusaders	Apr 04	10 – 52	-1.60	TRUE
6	Lions vs. Bulls	Apr 04	22 – 18	-2.70	FALSE

Predictions for Round 9

Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

	Game	Date	Winner	Prediction
1	Blues vs. Brumbies	Apr 10	Blues	0.00
2	Crusaders vs. Highlanders	Apr 11	Crusaders	14.70
3	Waratahs vs. Stormers	Apr 11	Waratahs	10.80
4	Force vs. Cheetahs	Apr 11	Force	7.10
5	Bulls vs. Reds	Apr 11	Bulls	15.20
6	Lions vs. Sharks	Apr 11	Lions	0.50

April 7, 2015

Briefly

By Thomas Lumley

NPR’s Science Friday covers BAHfest, a competition to produce what look like scientific arguments for nutty conclusions. Hilarious, but also important: serious and scammy pseudoscience uses the same tricks.
Emma Pierson, a scientist who studies dating, analyses a year of emails with her boyfriend:
Him: You’re going to find some weird pattern and break up with me.

Me: Either that will be warranted by the data, in which case it’s a good thing, or it won’t, in which case I’m a bad statistician. Are you saying I’m a bad statistician?
And a post by Emma Pierson at 538.com: “people just want to date themselves”

Another story about changes in cancer risk that just uses number of diagnoses, without even gesturing in the direction of screening bias.

Evils of Axis

By Thomas Lumley

First, from Mother Jones magazine, via Twitter

The impact of the carbon tax looks impressive, but this is a bar chart: bar heights are read from zero, and they’ve only shown the top fifth of the range.
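A rough sketch of how much a truncated axis exaggerates a change. The values here are hypothetical round numbers, not the emissions data; the only thing taken from the chart is the idea that the axis starts about four-fifths of the way up.

```python
def true_drop(before, after):
    """Relative change measured from zero, as a bar chart should show it."""
    return (before - after) / before

def apparent_drop(before, after, axis_start):
    """Relative change in visible bar height when the axis starts above zero."""
    return (before - after) / (before - axis_start)

# Hypothetical values: the axis shows only the top fifth (80 to 100)
before, after, axis_start = 100.0, 90.0, 80.0

print(f"True drop:     {true_drop(before, after):.0%}")
print(f"Apparent drop: {apparent_drop(before, after, axis_start):.0%}")
```

Showing only the top fifth makes every change look five times as large as it is.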

They do link to the data, the quarterly Greenhouse Gas Inventory update. In that report, Figure 8 is

The dotted line is the same data as the bar chart, except that the dotted line has data for every quarter and the bar chart has data only for the July-September quarter each year. And the line chart has a wider range on the vertical axis — it doesn’t go down to zero, but it isn’t a bar chart, so it doesn’t have to. The other point about the line chart is that there’s a solid line there as well. The solid line is adjusted for seasonal variation and weather. If you wanted to know about real changes in how Australians are using energy, that’s the line you’d use.

Second, a beautiful map of CO₂ emissions from fossil fuel combustion, from the Washington Post via Flowing Data

The ‘vertical’ scale here is a colour scale; what’s misleading is that it’s a logarithmic scale. The map makes it look as if a large fraction of CO₂ emission comes from transporting stuff through empty areas, but the pale beige indicates emissions thousands of times lower than in the urban/suburban areas. Red ink isn’t anywhere close to being proportional to CO₂.
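To illustrate what a logarithmic colour scale does, this sketch compares where a value lands on a 0 to 1 colour ramp under log and linear scaling. The range of 1 to 10,000 in arbitrary units is an assumption for illustration, not the scale used on the Washington Post map.

```python
import math

def log_position(value, lo, hi):
    """Position on a 0-1 colour ramp when the scale is logarithmic."""
    return (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))

def linear_position(value, lo, hi):
    """Position on a 0-1 colour ramp when the scale is linear."""
    return (value - lo) / (hi - lo)

lo, hi = 1.0, 10_000.0  # hypothetical emission range, arbitrary units

for value in (1, 10, 100, 1_000, 10_000):
    print(f"{value:>6}: log {log_position(value, lo, hi):.2f}, "
          f"linear {linear_position(value, lo, hi):.4f}")
```

On the log scale a value that is 1% of the maximum sits halfway along the ramp, so it gets a lot of ink; on a linear scale it would be nearly indistinguishable from the minimum.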

What’s wrong with this picture?

By Thomas Lumley

So, I was on a plane from Sydney yesterday that was old enough they told us to switch off our books half an hour before landing. As a result, I actually looked at the Auckland information on the flight map channel (photo taken after we were allowed technology, naturally):

It’s interesting to see where these numbers come from, given all the different ways these things can be defined. Two of these numbers are inconsistent with each other and somewhat obsolete, and the third isn’t even wrong.

According to the Google, the population number 1,377,200 is the June 2011 estimate of the urban population of the Auckland metropolitan area. Ok, that’s a bit old but so was the plane. Slightly more strange is that StatsNZ thinks the urban Auckland population at 30 June 2011 was 1,351,200, but that’s probably a matter of projections being made in advance and then adjusted as more information comes in. The current (June 2014) estimate is 1,413,500.

So if the population is urban Auckland, what’s the area? With a bit of searching, you can find it’s the area of the old Auckland City, the central Auckland isthmus plus various islands. Auckland City had a population of 450,000 when it was absorbed into the Supercity in 2010. The population and area numbers are for very different entities, and the population number, although old, dates from after the area number became completely obsolete.

The area that goes with the 1,377,200 number is 1,102.9km², the size of the Statistical Urban Area. You could reasonably want the urbanised area (483km²) or the Metropolitan Urban Limits (560km²) as better summaries of the size of Auckland, but they don’t match the quoted population.

That leaves elevation. The picture next to the statistics shows that 78m is not a completely satisfactory characterisation of the elevation of Auckland. The blue stuff with boats floating on it is at sea level (up to tidal variation). Here’s a map (from FloodMap.net) of Auckland elevation; the change from pink to red is at 75m.

Overall, population and area, which could have multiple satisfactory definitions, are defined incompatibly with each other. Elevation doesn’t really have a satisfactory definition, but isn’t 78m.

April 6, 2015

Stat of the Week Competition: April 4 – 10 2015

By Rachel Cunliffe

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday April 10 2015.
Statistics can be bad, exemplary or fascinating.
The statistic must be in the NZ media during the period of April 4 – 10 2015 inclusive.
Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.
