October 17, 2018

Briefly

  • The Crime Machine: two podcast episodes (with transcripts) on New York City police and the good and bad effects of trying to measure crime and police effort
  • You may have heard that Senator Elizabeth Warren had a genetic ancestry test. Carl Zimmer has a very good Twitter thread on what the results don’t mean
  • ‘The Robots Learn By Watching Us’. Bloomberg columnist Matt Levine on training computers to behave like humans in stock picking and employment
  • The New York Times has a map of every building in the United States.
  • The Australian Bureau of Statistics is having its funding cut over time, which is probably not a good thing. This graph, though:
    For bar charts and other area charts, the area is carrying the information, so you can’t just chop the bottom half off the graph. I put it back on:
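The point about chopped-off bar charts can be put in numbers. This is a minimal sketch with made-up figures (they are not the ABS funding numbers), and `apparent_ratio` is a name invented here: it compares the ratio of two values with the ratio of the bar lengths a reader actually sees once the axis starts above zero.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar lengths for values a and b when the axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical figures for illustration only: a 20% cut.
before, after = 100.0, 80.0

true_ratio = apparent_ratio(after, before)                       # axis starts at zero
truncated_ratio = apparent_ratio(after, before, baseline=60.0)   # axis starts at 60

print(f"zero-based axis: second bar is {true_ratio:.0%} of the first")
print(f"axis cut at 60:  second bar is {truncated_ratio:.0%} of the first")
```

With these numbers a 20% fall is drawn as if it were a 50% fall, which is why the bottom of the chart needs to stay on.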
October 16, 2018

Restart a heart

I see from Twitter that it’s World Restart A Heart Day, encouraging people to learn CPR. Which you should do. It’s not hard. Nowadays there are even Spotify playlists of songs with the right beat for chest compressions.

However, two statistical points:

  1. Even if you don’t know CPR, the nice people at the ambulance service (111 emergency number in NZ) can tell you what to do. This works well enough that there have been randomised trials (in the US) comparing the effectiveness of different sets of instructions
  2. The success rate of CPR in real life is not as high as on television. If you give someone CPR it’s quite likely not to work, and that won’t be your fault. The success rate of CPR is still higher than no CPR.

Mitre 10 Cup Predictions for the Mitre 10 Cup Semi-Finals

Team Ratings for the Mitre 10 Cup Semi-Finals

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team Current Rating Rating at Season Start Difference
Canterbury 13.65 15.32 -1.70
Wellington 13.30 12.18 1.10
Tasman 9.50 2.62 6.90
Auckland 9.27 -0.50 9.80
Waikato 6.49 -3.24 9.70
North Harbour 5.56 6.42 -0.90
Otago 0.34 0.33 0.00
Counties Manukau -2.03 1.84 -3.90
Taranaki -4.85 6.58 -11.40
Bay of Plenty -4.95 0.27 -5.20
Northland -6.02 -3.45 -2.60
Hawke’s Bay -7.21 -13.00 5.80
Manawatu -11.83 -4.36 -7.50
Southland -23.39 -23.17 -0.20

 

Performance So Far

So far there have been 70 matches played, 50 of which were correctly predicted, a success rate of 71.4%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Southland vs. Auckland Oct 10 8 – 56 -23.80 TRUE
2 Tasman vs. Hawke’s Bay Oct 11 29 – 0 18.90 TRUE
3 Taranaki vs. Wellington Oct 12 10 – 34 -12.00 TRUE
4 Bay of Plenty vs. Northland Oct 13 38 – 35 5.50 TRUE
5 Waikato vs. Otago Oct 13 19 – 23 13.30 FALSE
6 Counties Manukau vs. Canterbury Oct 13 14 – 19 -13.10 TRUE
7 Auckland vs. North Harbour Oct 14 45 – 29 3.70 TRUE
8 Manawatu vs. Southland Oct 14 38 – 26 14.20 TRUE
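As a rough check on the Correct column and the running success rate, here is a minimal sketch. It assumes a prediction counts as correct when the predicted margin and the actual margin (home score minus away score) have the same sign, and it takes the earlier totals (62 played, 43 correct) from the Round 9 post of October 9. The helper name `is_correct` is invented for this example, and treating a draw as incorrect is my assumption.

```python
# (home, away, home_score, away_score, predicted_margin) copied from the table above
games = [
    ("Southland", "Auckland", 8, 56, -23.80),
    ("Tasman", "Hawke's Bay", 29, 0, 18.90),
    ("Taranaki", "Wellington", 10, 34, -12.00),
    ("Bay of Plenty", "Northland", 38, 35, 5.50),
    ("Waikato", "Otago", 19, 23, 13.30),
    ("Counties Manukau", "Canterbury", 14, 19, -13.10),
    ("Auckland", "North Harbour", 45, 29, 3.70),
    ("Manawatu", "Southland", 38, 26, 14.20),
]


def is_correct(home_score: int, away_score: int, predicted_margin: float) -> bool:
    """True when predicted and actual margins point to the same winner."""
    actual_margin = home_score - away_score
    # A product of zero (a draw or a zero prediction) is counted as incorrect: an assumption.
    return actual_margin * predicted_margin > 0


week_correct = sum(is_correct(h, a, p) for _, _, h, a, p in games)

# Totals before this round, as reported in the Round 9 post.
total_played = 62 + len(games)
total_correct = 43 + week_correct
rate = 100 * total_correct / total_played

print(f"{week_correct}/{len(games)} this round; {total_correct}/{total_played} = {rate:.1f}%")
```

The only miss is Waikato vs. Otago, and the totals agree with the 50 from 70 quoted above.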

 

Predictions for the Mitre 10 Cup Semi-Finals

Here are the predictions for the Mitre 10 Cup Semi-Finals. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game Date Winner Prediction
1 Waikato vs. Northland Oct 19 Waikato 16.50
2 Tasman vs. Canterbury Oct 19 Canterbury -0.20
3 Auckland vs. Wellington Oct 20 Wellington -0.00
4 Otago vs. Hawke’s Bay Oct 20 Otago 11.50
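One way to see how the ratings relate to these predictions is to subtract the away team's rating from the home team's rating and compare the result with the published margin. This is only a sketch built from the numbers on this page, not the method itself (which is described on the Department home page). The name `implied_home_edge` is invented here, and reading the leftover as a home-ground allowance is my assumption.

```python
# Current ratings copied from the ratings table above.
ratings = {
    "Waikato": 6.49, "Northland": -6.02,
    "Tasman": 9.50, "Canterbury": 13.65,
    "Auckland": 9.27, "Wellington": 13.30,
    "Otago": 0.34, "Hawke's Bay": -7.21,
}

# (home, away, published predicted margin) copied from the predictions table above.
fixtures = [
    ("Waikato", "Northland", 16.50),
    ("Tasman", "Canterbury", -0.20),
    ("Auckland", "Wellington", -0.00),
    ("Otago", "Hawke's Bay", 11.50),
]

# Published margin minus the raw rating gap: whatever is left over for playing at home.
implied_home_edge = [
    prediction - (ratings[home] - ratings[away])
    for home, away, prediction in fixtures
]

for (home, away, _), edge in zip(fixtures, implied_home_edge):
    print(f"{home} vs. {away}: leftover of {edge:.2f} points")
```

For all four games the leftover is close to 4 points, which is consistent with (but does not prove) a fixed allowance for the home team on top of the rating difference.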

 

Currie Cup Predictions for the SemiFinals

Team Ratings for the SemiFinals

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Note that Cheetahs2 refers to the Cheetahs team when there is a Pro14 match. The assumption is that the team playing in the Pro14 is the top team and the Currie Cup team is essentially a second team.


Team Current Rating Rating at Season Start Difference
Western Province 8.57 4.66 3.90
Sharks 4.28 4.18 0.10
Lions 2.80 3.23 -0.40
Cheetahs 2.23 3.86 -1.60
Blue Bulls -0.22 0.94 -1.20
Pumas -8.17 -8.36 0.20
Griquas -11.05 -9.78 -1.30
Cheetahs2 -29.69 -30.00 0.30

 

Performance So Far

So far there have been 21 matches played, 18 of which were correctly predicted, a success rate of 85.7%.
Here are the predictions for last week’s games.


Game Date Score Prediction Correct
1 Pumas vs. Lions Oct 12 21 – 33 -5.40 TRUE
2 Griquas vs. Sharks Oct 13 11 – 41 -9.50 TRUE
3 Blue Bulls vs. Western Province Oct 13 7 – 34 -2.80 TRUE

 

Predictions for the SemiFinals

Here are the predictions for the SemiFinals. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.


Game Date Winner Prediction
1 Sharks vs. Lions Oct 20 Sharks 6.00
2 Western Province vs. Blue Bulls Oct 20 Western Province 13.30

 

October 11, 2018

Briefly

  • How good is ‘good’? YouGov looks at how various words rate in the UK. “Average” gets 5 on a 10-point scale, which is more positive than I think it would rate here, though they didn’t ask about “a bit average” or “pretty average”
  • Nepal introduced a ban on internet porn. It cut Nepal’s traffic to porn site xHamster by about 50%. For about two weeks.  (not NSFW apart from being about porn)
  • The UK has a Statistics Authority to remind government and parliament not to misuse official statistics. This week, in correspondence with the Department for Education: “figures were presented in such a way as to misrepresent changes in school funding. In the tweet, school spending figures were exaggerated by using a truncated axis, and by not adjusting for per pupil spend.” (Note: using nominal, aggregate figures rather than real, per capita figures to report spending, and truncating bar chart axes, are just as wrong here as in the UK)
  • Seek.co.nz are promoting their guide to salaries on Twitter again. This is based on advertised salaries/wages for positions advertised on Seek, not actual money paid to any group — and, for example, I’d be pleasantly surprised if “Kitchen and Sandwich Hands” actually averaged over $41,000 annual wages.

Carefully taught

Q: It’s shocking how computers can be so sexist

A: Not really the computers; more the users

Q: But they took this computer program and showed it lots of people’s applications, and it downrated the ones from women

A: Yes, but that’s because they also trained it with information about which applications they thought were best, and it learned from them that women’s applications weren’t as good

Q: Couldn’t it just have seen that more men than women were accepted because more men applied, and over-generalised?

A: Not really. It should be looking at the probability of acceptance, which wouldn’t be affected by overall proportions, but would be affected by human bias.

Q: Could the bias all have come in via word associations, like in that ‘how to make a racist AI’ blog post?

A: Perhaps. But only if they weren’t really trying. In particular, however the bias came in, they should have been aware of the potential and audited the results. I mean, this is a respectable organisation; you’d assume they were that responsible

Q: That sounds like a simple piece of advice

A: Yes, but even 30 years later, people are still making the same mistakes

Q: Wait, what? Aren’t we talking about Amazon?

A: No, St George’s Hospital Medical School, London.  In the BMJ in 1988, based on a program written in the 1970s
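The earlier answer about looking at the probability of acceptance rather than raw counts can be illustrated with a small sketch. All the counts below are hypothetical (they are not the St George’s or Amazon data), and `acceptance_rates` is a name invented for this example.

```python
def acceptance_rates(accepted: dict, applicants: dict) -> dict:
    """Per-group probability of acceptance: accepted / applied."""
    return {group: accepted[group] / applicants[group] for group in applicants}


# Hypothetical applicant pool: four times as many men as women apply.
applicants = {"men": 800, "women": 200}

# Unbiased selectors: more men are accepted, but only because more applied.
accepted_unbiased = {"men": 160, "women": 40}

# Biased selectors: same pool, but women are accepted at half the rate.
accepted_biased = {"men": 160, "women": 20}

rates_unbiased = acceptance_rates(accepted_unbiased, applicants)
rates_biased = acceptance_rates(accepted_biased, applicants)

print("unbiased:", rates_unbiased)
print("biased:  ", rates_biased)
```

In the first case the acceptance rate is the same for both groups even though four times as many men are accepted, so a program modelling the probability of acceptance has nothing sex-related to learn. In the second case the rates differ, and that difference comes from the human decisions used for training.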

October 9, 2018

Mitre 10 Cup Predictions for Round 9

Team Ratings for Round 9

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team Current Rating Rating at Season Start Difference
Canterbury 14.38 15.32 -0.90
Wellington 12.22 12.18 0.00
Tasman 8.59 2.62 6.00
Waikato 8.04 -3.24 11.30
North Harbour 6.47 6.42 0.10
Auckland 6.18 -0.50 6.70
Otago -1.22 0.33 -1.50
Counties Manukau -2.77 1.84 -4.60
Taranaki -3.77 6.58 -10.30
Bay of Plenty -4.72 0.27 -5.00
Northland -6.24 -3.45 -2.80
Hawke’s Bay -6.30 -13.00 6.70
Manawatu -11.44 -4.36 -7.10
Southland -21.60 -23.17 1.60

 

Performance So Far

So far there have been 62 matches played, 43 of which were correctly predicted, a success rate of 69.4%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Otago vs. Bay of Plenty Oct 03 45 – 34 8.30 TRUE
2 Wellington vs. Auckland Oct 04 24 – 29 13.30 FALSE
3 Hawke’s Bay vs. Manawatu Oct 05 45 – 17 5.00 TRUE
4 Northland vs. Waikato Oct 06 28 – 71 -3.10 TRUE
5 North Harbour vs. Counties Manukau Oct 06 36 – 26 14.00 TRUE
6 Canterbury vs. Taranaki Oct 06 41 – 7 19.50 TRUE
7 Southland vs. Bay of Plenty Oct 07 22 – 26 -15.10 TRUE
8 Otago vs. Tasman Oct 07 21 – 47 -1.60 TRUE

 

Predictions for Round 9

Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game Date Winner Prediction
1 Southland vs. Auckland Oct 10 Auckland -23.80
2 Tasman vs. Hawke’s Bay Oct 11 Tasman 18.90
3 Taranaki vs. Wellington Oct 12 Wellington -12.00
4 Bay of Plenty vs. Northland Oct 13 Bay of Plenty 5.50
5 Waikato vs. Otago Oct 13 Waikato 13.30
6 Counties Manukau vs. Canterbury Oct 13 Canterbury -13.10
7 Auckland vs. North Harbour Oct 14 Auckland 3.70
8 Manawatu vs. Southland Oct 14 Manawatu 14.20

 

October 5, 2018

Briefly

  • “Data for Sale” at Stuff, on data ethics
  • “How a math genius hacked OkCupid to find true love” at Wired
  • Chris Knox interviews Cathy O’Neil, who is in New Zealand for a Stats NZ ‘Data Summit’.  StatsChat readers will already be familiar with Dr O’Neil aka mathbabe.org
  • ‘People disagree about what fairness looks like. That’s true in general, and also true when you try to write down a mathematical equation and say, “This is the definition of fairness.”’ An interview with Dr Kristian Lum of the Human Rights Data Analysis Group
  • A group at Johns Hopkins Dept of Biostatistics have been working to reduce the scarcity value of data science. They have a new program: “excited to announce the first part of our new system, a new set of massive online open courses called Chromebook Data Science. These MOOCs are for anyone from high schoolers on up to get into data science. If you can read and follow instructions you can learn data science from these courses!” (There’s obviously a potential conflict of interest here with Auckland’s data science programs, but I think there’s a separate market for in-person training where you can ask questions)
  • “Which neighborhoods in America offer children the best chance at a better life than their parents? The Opportunity Atlas uses anonymous data following 20 million Americans from childhood to their mid-thirties to answer this question.” There’s an obvious difficulty with any dataset like this — if you’re looking at people in their mid-thirties, they were children quite a while ago and things may have changed.  Still interesting to explore.
October 4, 2018

Australia votes for a shag

It’s time for StatsChat’s favourite bogus poll: Forest & Bird’s Bird of the Year.

In contrast to most bogus online polls, Bird of the Year doesn’t pretend to be anything more than a publicity stunt, and no-one seriously believes the huge year-to-year variation in the results has any real meaning in popular opinion.

Bird of the Year still has more quality control than most bogus polls. They require a unique email address per vote, and this year have monitoring by Dragonfly Data Science.

Dragonfly noticed an apparent attempt to hack the vote last night, with a large number of votes from a single Australian IP address for the cormorants or shags, kawau in te reo.
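For a sense of what the simplest version of this kind of monitoring might look like, here is a minimal sketch that counts votes per IP address and flags any address over a threshold. It is not Dragonfly’s actual method, which isn’t described here. The vote log, the threshold and the function name `suspicious_ips` are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
from collections import Counter


def suspicious_ips(votes: list, threshold: int) -> dict:
    """Return {ip: vote_count} for every IP address with more than `threshold` votes."""
    counts = Counter(ip for ip, _bird in votes)
    return {ip: n for ip, n in counts.items() if n > threshold}


# Hypothetical vote log of (ip_address, bird) pairs.
votes = [("203.0.113.7", "shag")] * 300 + [
    ("198.51.100.1", "kea"),
    ("198.51.100.2", "kereru"),
    ("198.51.100.1", "kaka"),
]

flagged = suspicious_ips(votes, threshold=50)
print(flagged)
```

A real check would have to allow for many legitimate voters sharing one address (an office or a university, say), so a count like this is a prompt to look more closely rather than proof of cheating.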

Yes, Bird of the Year is a joke. But any other online clicky poll is at least as much of a joke.

(PS: for the sake of people whose tolerance for this sort of thing is lower than yours, if you tweet about Bird of the Year, use the hashtag)

October 2, 2018

International comparisons

From Pew Research, via Twitter

That list of European countries, presumably intended to give the most meaningful comparison to the USA, is a bit unusual.

It includes Malta and Cyprus and Liechtenstein, but doesn’t include Ireland or the UK.