May 15, 2015

Useful science/health reporting

As I’ve commented before, most good-quality randomised trials of vitamins in humans have disappointing results. A few don’t, and it’s nice to see these reported accurately. The Herald tells us about an Australian trial which has found that nicotinamide, a version of vitamin B3, can reduce the rate of new minor skin cancers in people who have already had a lot of them. This isn’t especially dramatic, but for many older pale-skinned people in New Zealand, Australia, or South Africa it could reduce a recurrent medical annoyance.

The only real omission in the Herald story is the link to the research: there’s a conference abstract for a talk to be given at the American Society of Clinical Oncology conference later this month.


Update: yes, this sort of story is less impressive and has less public health significance than claiming WiFi causes brain cancer in children, but it does have the advantage of being true.

May 13, 2015

NRL Predictions for Round 10

Team Ratings for Round 10

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Roosters                  8.69                    9.09       -0.40
Cowboys                   6.99                    9.52       -2.50
Rabbitohs                 6.22                   13.06       -6.80
Storm                     4.95                    4.36        0.60
Broncos                   4.20                    4.03        0.20
Dragons                   1.44                   -1.74        3.20
Panthers                  0.93                    3.69       -2.80
Warriors                  0.82                    3.07       -2.30
Raiders                  -1.22                   -7.09        5.90
Bulldogs                 -1.66                    0.21       -1.90
Knights                  -1.83                   -0.28       -1.60
Sea Eagles               -2.02                    2.68       -4.70
Sharks                   -5.85                  -10.76        4.90
Titans                   -6.74                   -8.20        1.50
Wests Tigers             -6.74                  -13.13        6.40
Eels                     -6.85                   -7.19        0.30


Performance So Far

So far there have been 72 matches played, 38 of which were correctly predicted, a success rate of 52.8%.

Here are the predictions for last week’s games.

Game                          Date    Score    Prediction  Correct
1  Broncos vs. Panthers       May 08  8 – 5          6.80  TRUE
2  Roosters vs. Wests Tigers  May 08  36 – 4        16.20  TRUE
3  Cowboys vs. Bulldogs       May 09  23 – 16       12.40  TRUE
4  Raiders vs. Titans         May 09  56 – 16        3.70  TRUE
5  Sharks vs. Warriors        May 09  16 – 20       -2.40  TRUE
6  Eels vs. Storm             May 10  10 – 28       -7.30  TRUE
7  Sea Eagles vs. Knights     May 10  30 – 28        3.00  TRUE
8  Rabbitohs vs. Dragons      May 11  16 – 10        8.10  TRUE
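
For anyone who wants to check the Correct column, here is a minimal Python sketch. It assumes a prediction counts as correct when the predicted margin and the actual margin (home score minus away score) have the same sign, which is the sign convention these posts use for predictions. The function name is made up for illustration, and treating a draw as incorrect is my assumption rather than anything stated in the post.

```python
# (home score, away score, predicted margin) for last week's NRL games,
# copied from the table above.
games = {
    "Broncos vs. Panthers": (8, 5, 6.80),
    "Roosters vs. Wests Tigers": (36, 4, 16.20),
    "Cowboys vs. Bulldogs": (23, 16, 12.40),
    "Raiders vs. Titans": (56, 16, 3.70),
    "Sharks vs. Warriors": (16, 20, -2.40),
    "Eels vs. Storm": (10, 28, -7.30),
    "Sea Eagles vs. Knights": (30, 28, 3.00),
    "Rabbitohs vs. Dragons": (16, 10, 8.10),
}


def prediction_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins have the same sign.

    Assumption: a draw (actual margin of zero) counts as incorrect.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


results = {game: prediction_correct(*row) for game, row in games.items()}
for game, ok in results.items():
    print(f"{game}: {ok}")

# Season-to-date success rate quoted in the post: 38 correct out of 72.
success_rate = f"{38 / 72:.1%}"
print("success rate:", success_rate)
```

Note that only the sign matters here: the Raiders were predicted to win by 3.7 and won by 40, and that still counts as a correct prediction.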


Predictions for Round 10

Here are the predictions for Round 10. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date    Winner       Prediction
1  Bulldogs vs. Roosters      May 15  Roosters          -7.30
2  Cowboys vs. Broncos        May 15  Cowboys            5.80
3  Eels vs. Warriors          May 16  Warriors          -3.70
4  Storm vs. Rabbitohs        May 16  Storm              1.70
5  Titans vs. Sharks          May 16  Titans             2.10
6  Dragons vs. Raiders        May 17  Dragons            5.70
7  Knights vs. Wests Tigers   May 17  Knights            7.90
8  Sea Eagles vs. Panthers    May 18  Sea Eagles         0.10


Super 15 Predictions for Round 14

Team Ratings for Round 14

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Crusaders                 8.89                   10.42       -1.50
Waratahs                  5.94                   10.00       -4.10
Hurricanes                5.66                    2.89        2.80
Chiefs                    4.32                    2.23        2.10
Brumbies                  3.60                    2.20        1.40
Bulls                     3.04                    2.88        0.20
Stormers                  2.35                    1.68        0.70
Highlanders               1.72                   -2.54        4.30
Blues                    -0.75                    1.44       -2.20
Sharks                   -1.68                    3.91       -5.60
Lions                    -2.45                   -3.39        0.90
Rebels                   -3.48                   -9.53        6.00
Force                    -4.80                   -4.67       -0.10
Cheetahs                 -5.64                   -5.55       -0.10
Reds                     -9.72                   -4.98       -4.70


Performance So Far

So far there have been 86 matches played, 55 of which were correctly predicted, a success rate of 64%.

Here are the predictions for last week’s games.

Game                          Date    Score    Prediction  Correct
1  Crusaders vs. Reds         May 08  58 – 17       20.80  TRUE
2  Rebels vs. Blues           May 08  42 – 22       -0.60  FALSE
3  Hurricanes vs. Sharks      May 09  32 – 24       12.50  TRUE
4  Force vs. Waratahs         May 09  18 – 11       -8.60  FALSE
5  Lions vs. Highlanders      May 09  28 – 23       -0.40  FALSE
6  Stormers vs. Brumbies      May 09  25 – 24        3.70  TRUE


Predictions for Round 14

Here are the predictions for Round 14. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date    Winner       Prediction
1  Blues vs. Bulls            May 15  Blues              0.70
2  Reds vs. Rebels            May 15  Rebels            -2.20
3  Hurricanes vs. Chiefs      May 16  Hurricanes         5.30
4  Waratahs vs. Sharks        May 16  Waratahs          12.10
5  Lions vs. Brumbies         May 16  Brumbies          -1.60
6  Cheetahs vs. Highlanders   May 16  Highlanders       -2.90
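
The post doesn’t spell out how a prediction is built from the ratings, so the following is a guess rather than the actual method. If we assume a prediction is roughly the home team’s rating minus the away team’s rating plus some home-ground term, that term can be backed out of the two Round 14 tables. The function name and the additive form are hypothetical, and the published ratings are rounded, so the results are only approximate.

```python
# Current ratings for the teams playing in Round 14, from the ratings table.
ratings = {
    "Blues": -0.75, "Bulls": 3.04, "Reds": -9.72, "Rebels": -3.48,
    "Hurricanes": 5.66, "Chiefs": 4.32, "Waratahs": 5.94, "Sharks": -1.68,
    "Lions": -2.45, "Brumbies": 3.60, "Cheetahs": -5.64, "Highlanders": 1.72,
}

# (home team, away team, predicted margin) from the Round 14 predictions.
predictions = [
    ("Blues", "Bulls", 0.70),
    ("Reds", "Rebels", -2.20),
    ("Hurricanes", "Chiefs", 5.30),
    ("Waratahs", "Sharks", 12.10),
    ("Lions", "Brumbies", -1.60),
    ("Cheetahs", "Highlanders", -2.90),
]


def implied_home_term(home, away, predicted_margin):
    """Whatever is left of the prediction after the rating difference.

    Assumes (hypothetically) prediction = home rating - away rating + term.
    """
    return predicted_margin - (ratings[home] - ratings[away])


implied = {
    f"{home} vs. {away}": round(implied_home_term(home, away, margin), 2)
    for home, away, margin in predictions
}
for game, term in implied.items():
    print(f"{game}: {term:+.2f}")
```

Under that assumption the leftover term is similar across all six games, which is at least consistent with a home-ground adjustment of about four points, but that is a reading of the numbers and not something the post states.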



  • Rating systems are the popular way to scale ‘reputation’ statistically so it works for internet transactions between people who don’t know each other. Tom Slee has a couple of pieces (via Cosma Shalizi): Some Obvious Things About Internet Reputation Systems and In praise of fake reviews:
    So the reviews that a restaurant owner believes are most likely to be fair are precisely the ones that Yelp judges to be untrustworthy….Unfortunately for restaurateurs, their opinions on trustworthy reviews are irrelevant. The company is not legally bound to be fair in its filtering and sorting activities,
  • Another example of interesting results failing to replicate, this time from a popular TED talk about posture. As the post at Data Colada points out, this is a strong non-replication: it’s not just that they didn’t see the effect, they ruled out even much weaker effects. There’s a reason statisticians go on and on about over-interpretation of single, small studies.
  • Looking at the Census data on religion, a map and set of stories from Lincoln Tan and Harkanwal Singh at the Herald
  • A rather different form of data journalism, reported at Buzzfeed (or the original paper report here). The Telegraph had a ‘tactical voting tool’ that said who you should vote for if your goal was a Labour or Tory government.  It was mostly honest, despite the paper’s well-known preferences. However, as Buzzfeed’s headline says: “The Telegraph’s Tactical Voting Tool Was Coded To Never Recommend The SNP”
  • From the LA Times, Wylie Burke on “Why whole-genome testing hurts more than it helps” (disclosure: I once co-supervised a student with Prof Burke)
  • A Slate article by a lawyer says Wyoming has ‘criminalized citizen science’ by creating a law against collecting and reporting environmental data. Now, Wyoming has created crimes of “unlawful collection of resource data” and “trespassing to unlawfully collect resource data”, but I’m pretty sure the Slate article exaggerates them.  “Unlawful collection” can only happen on private land, which the article clearly gets wrong. “Trespassing to unlawfully collect” can happen on public land, but I’m not convinced that in the National Park example there isn’t the necessary authorisation to enter the land. Presumably the law does something or they wouldn’t have bothered passing it, and it’s probably something evil, but a better article would have been nice.
May 11, 2015

Stat of the Week Competition: May 9 – 15 2015

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday May 15 2015.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of May 9 – 15 2015 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


May 10, 2015

The problem with medical progress

Stuff says:

There has been an alarming upward trend in the costs of similar treatments, as more drugs are developed and come on to the market, new Pharmac figures show.

I would argue that this is almost precisely not the problem. The story covers two important issues, but doesn’t distinguish them well.

The first issue is that many expensive new drugs aren’t very good. To get a drug approved for marketing you don’t need to show it’s better than the current stuff, and it often isn’t. Similar treatments might still be useful to have, if they give other options for people with side effects or have more convenient dosing, but they are often more expensive.

The United States is very bad at not using treatments that are similar (or worse) but more expensive, so these drugs are a problem there. Here, we’re quite good at not using them, so they don’t matter all that much. As long as Pharmac enjoys popular support and the media doesn’t buy into too many drug-industry publicity campaigns, we can ignore the expensive new drugs that aren’t worth the cost.

A second issue is that a subset of the expensive new drugs aren’t similar. The story quotes the price differences for ‘anthracycline’ (doxorubicin or epirubicin) and two newer breast cancer drugs, docetaxel and trastuzumab, as evidence of increases over the years.

Anthracyclines haven’t gone away. In fact, they’re quite a bit less expensive now than they were in 2002. The reason Pharmac now buys docetaxel and trastuzumab is that they’re worth the extra cost for at least some women. The existence of trastuzumab is not a problem for the healthcare system, it’s an opportunity.

There is a problem coming, though: many of the new drugs have names ending in ‘mab’.

Monoclonal antibodies, ‘mab’s, are one of the classes of ‘biologics’: big, complex molecules made by living cells. Making and testing generic versions of biologics (called ‘biosimilars’) is much harder than running up off-brand doxorubicin. Even when the patents run out on the ‘mab’s and ‘ept’s, a competitive market might be a while in developing and prices will stay higher. It’s not so much the expensive on-patent drugs that are a worrying change, it’s the prospect of expensive off-patent drugs in the future.


May 9, 2015


  • Ben Lauderdale carefully examines how his predictions (and everyone else’s) were so staggeringly wrong in the UK election. More people should do this sort of thing.
  • So, in a different way, did Tom Katsumi
  • There’s a US court case on whether FDA restrictions on ‘off-label’ drug advertising (ie, advertising a drug for uses it hasn’t been approved for) violate the First Amendment. There’s a definite chance the FDA will lose, which would strengthen the incentives to find new uses for drugs, but weaken the incentives to collect good evidence that the drug is actually effective for these uses.
  • Just over half of Tinder users are single, but that’s ok because the company “never intended it to be a dating platform.” Maybe people are just using it for the articles.
  • How lucky do you have to be for it to be evidence of insider trading? Especially when you consider what some traders will call a “once in 3 billion years” event. What isn’t mentioned here but is important is the idea of likelihood ratios: we’re not just looking at whether an event is unlikely, but whether it becomes much more likely under the alternative hypothesis that you’ve got an inside source.
  • Player characters in role-playing games accept unreasonable risks:
    Someone, we’ll call that person the Game Master, wants you to accept a single D8 die roll, with death no resurrection on a 1. But as a sweetener, if you roll 2-8 you’ll get…something nice! What would the something nice have to be for you to agree to make that roll? 
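
On the likelihood-ratio point in the insider-trading item above, a small sketch may help. It uses Bayes’ rule in odds form: posterior odds equal prior odds times the likelihood ratio. Every number below is invented purely for illustration, and the function name is hypothetical; nothing here comes from the linked story.

```python
def posterior_odds(prior_odds, p_event_if_insider, p_event_if_lucky):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    likelihood_ratio = p_event_if_insider / p_event_if_lucky
    return prior_odds * likelihood_ratio


# Invented inputs, for illustration only.
prior = 1 / 10_000     # prior odds that a given trader has an inside source
p_if_insider = 0.5     # chance of a run this good with an inside source
p_if_lucky = 1e-6      # chance of a run this good by luck alone

odds = posterior_odds(prior, p_if_insider, p_if_lucky)
print(f"posterior odds of an inside source: about {odds:.0f} to 1")
```

The point is that the tiny probability under luck isn’t enough by itself. If the same run were just as improbable for someone with an inside source, the likelihood ratio would be 1 and the odds wouldn’t move at all.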
May 6, 2015

Super 15 Predictions for Round 13

Team Ratings for Round 13

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Crusaders                 7.73                   10.42       -2.70
Waratahs                  6.87                   10.00       -3.10
Hurricanes                5.98                    2.89        3.10
Chiefs                    4.32                    2.23        2.10
Brumbies                  3.36                    2.20        1.20
Bulls                     3.04                    2.88        0.20
Stormers                  2.60                    1.68        0.90
Highlanders               2.09                   -2.54        4.60
Blues                     0.43                    1.44       -1.00
Sharks                   -2.00                    3.91       -5.90
Lions                    -2.83                   -3.39        0.60
Rebels                   -4.66                   -9.53        4.90
Cheetahs                 -5.64                   -5.55       -0.10
Force                    -5.74                   -4.67       -1.10
Reds                     -8.56                   -4.98       -3.60


Performance So Far

So far there have been 80 matches played, 52 of which were correctly predicted, a success rate of 65%.

Here are the predictions for last week’s games.

Game                          Date    Score    Prediction  Correct
1  Highlanders vs. Sharks     May 01  48 – 15        5.60  TRUE
2  Brumbies vs. Waratahs      May 01  10 – 13        1.10  FALSE
3  Blues vs. Force            May 02  41 – 24        9.70  TRUE
4  Hurricanes vs. Crusaders   May 02  29 – 23        1.60  TRUE
5  Rebels vs. Chiefs          May 02  16 – 15       -5.30  FALSE
6  Cheetahs vs. Stormers      May 02  25 – 17       -5.90  FALSE
7  Bulls vs. Lions            May 02  35 – 33       11.00  TRUE


Predictions for Round 13

Here are the predictions for Round 13. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date    Winner       Prediction
1  Crusaders vs. Reds         May 08  Crusaders         20.80
2  Rebels vs. Blues           May 08  Blues             -0.60
3  Hurricanes vs. Sharks      May 09  Hurricanes        12.50
4  Force vs. Waratahs         May 09  Waratahs          -8.60
5  Lions vs. Highlanders      May 09  Highlanders       -0.40
6  Stormers vs. Brumbies      May 09  Stormers           3.70


All-Blacks birth month

This graphic and the accompanying story in the Herald produced a certain amount of skeptical discussion on Twitter today.


It looks a bit as though there is an effect of birth month, and the Herald backs this up with citations to Malcolm Gladwell on ice hockey.

The first question is whether there is any real evidence of a pattern. There is, though it’s not overwhelming. If you did this for random sets of 173 people, about 1 in 80 times there would be 60 or more in the same quarter (and yes, I did use actual birth frequencies rather than just treating all quarters as equal). The story also looks at the Black Caps, where evidence is a lot weaker because the numbers are smaller.
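
Here is a rough Python sketch of that sort of calculation. It is not the calculation behind the 1-in-80 figure: it assumes all four quarters are equally likely, whereas the figure above used actual birth frequencies, and it adds up the four single-quarter tail probabilities, which is a slight upper bound on the chance that any quarter reaches 60. The function names are made up for illustration.

```python
from math import comb


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_some_quarter_at_least(n, k, quarter_probs):
    """Upper bound on P(at least one quarter has k or more of n births).

    Adds the per-quarter tail probabilities (a union bound). The overlap,
    two quarters both reaching k, is negligible for these numbers.
    """
    return min(1.0, sum(binom_tail(n, k, p) for p in quarter_probs))


# Assumption: equal quarters, not the actual birth frequencies.
equal_quarters = [0.25, 0.25, 0.25, 0.25]
p_any = prob_some_quarter_at_least(173, 60, equal_quarters)
print(f"roughly 1 in {1 / p_any:.0f}")
```

Swapping in real quarterly birth proportions for equal_quarters would give a better comparison, since uneven quarters make a large count in the most common quarter a bit more likely.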

On the other hand, we are comparing to a pre-existing hypothesis here. If you asked whether the data were a better fit to equal distribution over quarters or to Gladwell’s ice-hockey statistic of a majority in the first quarter, they are a much better fit to equal distribution over quarters.
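
To put a rough number on ‘much better fit’, here is a sketch comparing the two hypotheses with a binomial likelihood ratio. It rests on assumptions of mine: that the 60 of 173 mentioned above is the count in the quarter Gladwell’s hypothesis points to, that 50% is a fair stand-in for ‘a majority in the first quarter’, and that equal quarters means 25%. Only the peak quarter is modelled, not the full four-quarter breakdown.

```python
from math import exp, log


def log_likelihood_ratio(k, n, p_a, p_b):
    """log of P(k of n | p_a) / P(k of n | p_b) for binomial counts.

    The binomial coefficient is the same under both hypotheses and cancels.
    """
    return k * log(p_a / p_b) + (n - k) * log((1 - p_a) / (1 - p_b))


# Assumptions: 60 of 173 in the peak quarter; 0.25 for equal quarters;
# 0.5 standing in for a 'majority in the first quarter'.
log_lr = log_likelihood_ratio(60, 173, 0.25, 0.5)
lr = exp(log_lr)
print(f"equal quarters fit better by a factor of about {lr:.0f}")
```

So the data can show some evidence of a birth-month pattern and still be far from the ice-hockey pattern: both hypotheses are off, but the Gladwell one is off by a lot more.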

The next step is to go slightly further than Gladwell, who is not (to put it mildly) a primary source. The fact that he says there is a study showing X is good evidence that there is a study showing X, but it isn’t terribly good evidence that X is true. His books are written to communicate an idea, not to provide balanced reporting or scientific reference.  The hockey analysis he quotes was the first study of the topic, not the last word.

It turns out that even for ice-hockey things are more complicated:

Using publically available data of hockey players from 2000–2009, we find that the relative age effect, as described by Nolan and Howell (2010) and Gladwell (2008), is moderate for the average Canadian National Hockey League player and reverses when examining the most elite professional players (i.e. All-Star and Olympic Team rosters).

So, if you expect the ice-hockey phenomenon to show up in New Zealand, the All Blacks, as our ‘most elite professional players’, might be the wrong place to look.

On the other hand Rugby League in the UK does show very strong relative age effects even into the national teams — more like the 50% in first quarter that Gladwell quotes for ice hockey. Further evidence that things are more complicated comes from soccer. A paper (PDF) looking at junior and professional soccer found imbalances in date of birth, again getting weaker at higher levels. They also had an interesting natural experiment when the eligibility date changed in Australia, from January 1 to August 1.


As the graph shows, the change in eligibility date was followed by a change in birth-date distribution, but not how you might expect. An August 1 cutoff saw a stronger first-quarter peak than the January 1 cutoff.

Overall, it really does seem to be true that relative age effects have an impact on junior sports participation, and possibly even high-level professional achievement. You still might not expect the ‘majority born in the first quarter’ effect to translate from the NHL as a whole to the All Blacks, and the data suggest it doesn’t.

Rather more important, however, are relative age effects in education. After all, there’s a roughly 99.9% chance that your child isn’t going to be an All Black, but education is pretty much inevitable. There’s similar evidence that the school-age cutoff has an effect on educational attainment, which is weaker than the sports effects, but impacts a lot more people. In Britain, where the school cutoff is September 1:

Analysis shows that approximately 6% fewer August-born children reached the expected level of attainment in the 3 core subjects at GCSE (English, mathematics and science) relative to September-born children (August born girls 55%; boys 44%; September born girls 61% boys 50%)

In New Zealand, with a March 1 cutoff, you’d expect worse average school performance for kids born on the dates the Herald story is recommending.

As with future All Blacks, the real issue here isn’t when to conceive. The real issue is that the system isn’t working as well for some people. The All Blacks (or more likely the Blues) might play better if they weren’t missing key players born in the wrong month. The education system, at least in the UK, would work better if it taught all children as well as it teaches those born in autumn.  One of these matters.



May 5, 2015

Civil unions down: not just same-sex

The StatsNZ press release on marriages, civil unions, and divorces to December 2014 points out the dramatic fall in same-sex civil unions with 2014 being the first full year of marriage equality. Interestingly, if you look at the detailed data, opposite-sex civil unions have also fallen by about 50%, from a low but previously stable level.