March 8, 2018

“Causal” is only the start

Jamie Morton has an interesting story in the Herald, reporting on research by Wellington firm Dot Loves Data.

They then investigated how well they all predicted the occurrence of assaults at “peak” times – between 10pm and 3am on weekends – and otherwise in “off-peak” times.

Unsurprisingly, a disproportionate number of assaults happened during peak times – but also within a very short distance of taverns.

The figures showed a much higher proportion of assault occurred in more deprived areas – and that, in off-peak times, socio-economic status proved a better predictor of assault than the nearness or number of bars.

Unsurprisingly, the police were unsurprised.

This isn’t just correlation: with good-quality location data and the difference between peak and other times, it’s not just a coincidence that the assaults happened near bars, nor is it just due to population density.  The closeness of the bars and the assaults also argues against the simple reverse-causation explanation: that bars are just sited near their customers, and it’s the customers who are the problem.

So, it looks as if you can predict violent crimes from the location of bars (which would be more useful if you couldn’t just cut out the middleman and predict violent crimes from the locations of violent crimes).  And if we moved the bars, the assaults would probably move with them: if we switched a florist’s shop and a bar, the assaults wouldn’t keep happening outside the florist’s.

What this doesn’t tell us directly is what would happen if we dramatically reduced the number of bars.  It might be that we’d reduce violent crime. Or it might be that it would concentrate around the smaller number of bars. Or it might be that the relationship between bars and fights would weaken: people might get drunk and have fights in a wider range of convenient locations.

It’s hard to predict the impact of changes in regulation that are intended to have large effects on human behaviour, which is why it’s important to evaluate the impact of new rules, and ideally to have some automatic way of removing them if they don’t do what they were supposed to.  Like the ban on pseudoephedrine in cold medicine.

March 6, 2018

Quantifying fairness

A bit more technical than usual, but definitely worth reading: “Reflections on Quantitative Fairness”.

A couple of less-technical excerpts:

Much communication consists of taking one or another of these fairness concepts as obvious or axiomatic and asserting the violation of that principle as a political or moral gotcha. Formalization should not be regarded as a panacea in these debates but perhaps it can help to cement the points that:

  • a lack of clarity can conceal a debate with real content and stakes
  • differences in priorities and understandings of fairness are actually unresolved and in principle unresolvable without trade-offs

and

As statistical thinkers in the political sphere we should be aware of the hazards of supplanting politics by an expert discourse. In general, every statistical intervention to a conversation tends to raise the technical bar of entry, until it is reduced to a conversation between technical experts. As a result, in matters of criminal justice, public health, and employment, the key stakeholders, whose stakes are human stakes, and who typically lack a statistical background, can easily fall out of the conversation.

So are we speaking statistics to power? Or are we merely providing that power with new tools for the marginalization of unquantified political concerns? What is the value of this quantitative fairness conversation to a person or community whose concerns will not be quantified for another decade, if ever?

That is: it’s worth trying to be clear about what the actual question is, but we have to be careful in doing that not to push out the people who know the answer.

Super 15 Predictions for Round 4

Team Ratings for Round 4

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Crusaders      15.86           15.23                    0.60
Hurricanes     15.61           16.18                   -0.60
Lions          13.19           13.81                   -0.60
Highlanders     9.87           10.29                   -0.40
Chiefs          8.60            9.29                   -0.70
Sharks          1.03            1.02                    0.00
Stormers        0.99            1.48                   -0.50
Brumbies        0.19            1.75                   -1.60
Blues           0.11           -0.24                    0.30
Waratahs       -2.88           -3.92                    1.00
Bulls          -3.70           -4.79                    1.10
Jaguares       -4.98           -4.64                   -0.30
Reds          -10.14           -9.47                   -0.70
Rebels        -12.12          -14.96                    2.80
Sunwolves     -19.04          -18.42                   -0.60

 

Performance So Far

So far there have been 16 matches played, 11 of which were correctly predicted, a success rate of 68.8%.
Here are the predictions for last week’s games.

#  Game                       Date    Score    Prediction  Correct
1  Blues vs. Chiefs           Mar 02  21 – 27   -4.80      TRUE
2  Reds vs. Brumbies          Mar 02  18 – 10   -8.90      FALSE
3  Crusaders vs. Stormers     Mar 03  45 – 28   19.10      TRUE
4  Sunwolves vs. Rebels       Mar 03  17 – 37   -0.60      TRUE
5  Sharks vs. Waratahs        Mar 03  24 – 24    9.00      FALSE
6  Bulls vs. Lions            Mar 03  35 – 49  -13.30      TRUE
7  Jaguares vs. Hurricanes    Mar 03  9 – 34   -15.40      TRUE
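
As a rough check on the success rate quoted above, here is a minimal Python sketch that re-scores last week’s games. The scoring rule is an assumption read off the table (a prediction counts as correct only when the predicted and actual margins have the same sign, so the drawn game counts as a miss); the post itself doesn’t spell the rule out. The earlier totals of 6 correct from 9 come from the Round 3 post.

```python
# Last week's games, copied from the table above:
# (home, away, home score, away score, predicted home-minus-away margin)
games = [
    ("Blues", "Chiefs", 21, 27, -4.80),
    ("Reds", "Brumbies", 18, 10, -8.90),
    ("Crusaders", "Stormers", 45, 28, 19.10),
    ("Sunwolves", "Rebels", 17, 37, -0.60),
    ("Sharks", "Waratahs", 24, 24, 9.00),
    ("Bulls", "Lions", 35, 49, -13.30),
    ("Jaguares", "Hurricanes", 9, 34, -15.40),
]


def is_correct(home_score, away_score, predicted_margin):
    # Assumed rule: correct only if the actual margin has the same sign
    # as the predicted margin, so a draw is never counted as correct.
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


round_correct = sum(is_correct(h, a, p) for _, _, h, a, p in games)

# Totals before this round, from the earlier Round 3 post: 6 correct out of 9.
total_played = 9 + len(games)
total_correct = 6 + round_correct
success_rate = 100 * total_correct / total_played

print(f"{round_correct} of {len(games)} correct this round")
print(f"{total_correct} of {total_played} overall: {success_rate:.1f}%")
```

Under that assumed rule the sketch reproduces the TRUE/FALSE column and the running totals of 11 from 16.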

 

Predictions for Round 4

Here are the predictions for Round 4. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                       Date    Winner       Prediction
1  Highlanders vs. Stormers   Mar 09  Highlanders   12.90
2  Rebels vs. Brumbies        Mar 09  Brumbies      -8.80
3  Hurricanes vs. Crusaders   Mar 10  Hurricanes     3.30
4  Reds vs. Bulls             Mar 10  Bulls         -2.40
5  Sharks vs. Sunwolves       Mar 10  Sharks        24.10
6  Lions vs. Blues            Mar 10  Lions         17.10
7  Jaguares vs. Waratahs      Mar 10  Jaguares       1.90
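
The method behind these numbers is described elsewhere, so the following is only a consistency check on the two tables in this post, not a statement of how the predictions are made. The Python sketch subtracts the gap in current ratings (home minus away) from each predicted margin. Reading the leftover as a home-ground allowance is an assumption on my part.

```python
# Current ratings before Round 4, copied from the ratings table above.
ratings = {
    "Crusaders": 15.86, "Hurricanes": 15.61, "Lions": 13.19,
    "Highlanders": 9.87, "Chiefs": 8.60, "Sharks": 1.03,
    "Stormers": 0.99, "Brumbies": 0.19, "Blues": 0.11,
    "Waratahs": -2.88, "Bulls": -3.70, "Jaguares": -4.98,
    "Reds": -10.14, "Rebels": -12.12, "Sunwolves": -19.04,
}

# (home, away, predicted home-minus-away margin) from the table above.
predictions = [
    ("Highlanders", "Stormers", 12.90),
    ("Rebels", "Brumbies", -8.80),
    ("Hurricanes", "Crusaders", 3.30),
    ("Reds", "Bulls", -2.40),
    ("Sharks", "Sunwolves", 24.10),
    ("Lions", "Blues", 17.10),
    ("Jaguares", "Waratahs", 1.90),
]

leftovers = []
for home, away, predicted in predictions:
    rating_gap = ratings[home] - ratings[away]
    # Whatever part of the prediction the rating gap does not explain.
    leftover = predicted - rating_gap
    leftovers.append(leftover)
    print(f"{home} vs. {away}: gap {rating_gap:6.2f}, leftover {leftover:5.2f}")
```

For every game this week the leftover lands between about 3.5 and 4 points in favour of the home team, which is at least consistent with a prediction built from the rating difference plus a home allowance.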

 

NRL Predictions for Round 1

Team Ratings for Round 1

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Storm          16.73           16.73                   -0.00
Broncos         4.78            4.78                   -0.00
Raiders         3.50            3.50                   -0.00
Cowboys         2.97            2.97                    0.00
Panthers        2.64            2.64                    0.00
Sharks          2.20            2.20                   -0.00
Eels            1.51            1.51                   -0.00
Roosters        0.13            0.13                   -0.00
Dragons        -0.45           -0.45                   -0.00
Sea Eagles     -1.07           -1.07                   -0.00
Bulldogs       -3.43           -3.43                   -0.00
Wests Tigers   -3.63           -3.63                    0.00
Rabbitohs      -3.90           -3.90                   -0.00
Warriors       -6.97           -6.97                    0.00
Knights        -8.43           -8.43                    0.00
Titans         -8.91           -8.91                    0.00

 

Predictions for Round 1

Here are the predictions for Round 1. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                       Date    Winner       Prediction
1  Dragons vs. Broncos        Mar 08  Broncos       -2.20
2  Knights vs. Sea Eagles     Mar 09  Sea Eagles    -4.40
3  Cowboys vs. Sharks         Mar 09  Cowboys        3.80
4  Wests Tigers vs. Roosters  Mar 10  Roosters      -0.80
5  Rabbitohs vs. Warriors     Mar 10  Rabbitohs      7.60
6  Bulldogs vs. Storm         Mar 10  Storm        -17.20
7  Panthers vs. Eels          Mar 11  Panthers       4.10
8  Titans vs. Raiders         Mar 11  Raiders       -9.40
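
The sign convention described above (positive margin means a home win, negative means an away win) can be written out as a short Python sketch. The function name `predicted_winner` is made up for illustration, and returning `None` for a margin of exactly zero is my choice, since the text doesn’t cover that case.

```python
def predicted_winner(home, away, margin):
    """Pick the predicted winner from a home-minus-away margin."""
    if margin > 0:
        return home
    if margin < 0:
        return away
    return None  # a margin of exactly zero is not covered by the stated rule


# (home, away, predicted margin, winner as listed in the table above)
round_one = [
    ("Dragons", "Broncos", -2.20, "Broncos"),
    ("Knights", "Sea Eagles", -4.40, "Sea Eagles"),
    ("Cowboys", "Sharks", 3.80, "Cowboys"),
    ("Wests Tigers", "Roosters", -0.80, "Roosters"),
    ("Rabbitohs", "Warriors", 7.60, "Rabbitohs"),
    ("Bulldogs", "Storm", -17.20, "Storm"),
    ("Panthers", "Eels", 4.10, "Panthers"),
    ("Titans", "Raiders", -9.40, "Raiders"),
]

derived = [predicted_winner(h, a, m) for h, a, m, _ in round_one]
listed = [w for _, _, _, w in round_one]
print(derived == listed)
```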

 

March 5, 2018

Briefly

  • The gender gap: JP Morgan claims to pay its women employees 99% of what the men get. Felix Salmon and Matt Levine both take on this statistic: it doesn’t show women are paid the same (they aren’t), it just argues against one particular mechanism for the pay gap.
  • “Starting with no knowledge at all of what it was seeing, the neural network had to make up rules about which images should be labeled “sheep”. And it looks like it hasn’t realized that “sheep” means the actual animal, not just a sort of treeless grassiness.” Janelle Shane.
  • Translation is another example of the amazingly good results neural networks can get, but with no grip on what’s actually going on. Douglas Hofstadter writes at the Atlantic about “The Shallowness of Google Translate”, and Mark Liberman at Language Log shows how it will translate random sequences of vowels into Hawaiian gibberish.
  • David Spiegelhalter on how to stop being so easily manipulated by misleading statistics
  • Tickets bought online for NZ Lotto are more likely to win. It’s obvious that there has to be a boring explanation for this. I suggested one that fitted the data.

February 27, 2018

Super 15 Predictions for Round 3

Team Ratings for Round 3

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Crusaders      15.98           15.23                    0.80
Hurricanes     15.04           16.18                   -1.10
Lions          13.15           13.81                   -0.70
Highlanders     9.87           10.29                   -0.40
Chiefs          8.53            9.29                   -0.80
Sharks          1.57            1.02                    0.60
Brumbies        1.20            1.75                   -0.50
Stormers        0.86            1.48                   -0.60
Blues           0.18           -0.24                    0.40
Waratahs       -3.42           -3.92                    0.50
Bulls          -3.65           -4.79                    1.10
Jaguares       -4.41           -4.64                    0.20
Reds          -11.15           -9.47                   -1.70
Rebels        -13.29          -14.96                    1.70
Sunwolves     -17.87          -18.42                    0.60

 

Performance So Far

So far there have been 9 matches played, 6 of which were correctly predicted, a success rate of 66.7%.
Here are the predictions for last week’s games.

#  Game                       Date    Score    Prediction  Correct
1  Highlanders vs. Blues      Feb 23  41 – 34   14.00      TRUE
2  Rebels vs. Reds            Feb 23  45 – 19   -2.00      FALSE
3  Sunwolves vs. Brumbies     Feb 24  25 – 32  -16.20      TRUE
4  Crusaders vs. Chiefs       Feb 24  45 – 23    9.40      TRUE
5  Waratahs vs. Stormers      Feb 24  34 – 27   -1.30      FALSE
6  Lions vs. Jaguares         Feb 24  47 – 27   21.80      TRUE
7  Bulls vs. Hurricanes       Feb 24  21 – 19  -17.00      FALSE

 

Predictions for Round 3

Here are the predictions for Round 3. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                       Date    Winner       Prediction
1  Blues vs. Chiefs           Mar 02  Chiefs        -4.80
2  Reds vs. Brumbies          Mar 02  Brumbies      -8.90
3  Crusaders vs. Stormers     Mar 03  Crusaders     19.10
4  Sunwolves vs. Rebels       Mar 03  Rebels        -0.60
5  Sharks vs. Waratahs        Mar 03  Sharks         9.00
6  Bulls vs. Lions            Mar 03  Lions        -13.30
7  Jaguares vs. Hurricanes    Mar 03  Hurricanes   -15.40

 

February 24, 2018

Scare stories: a pain in the neck

From the Herald, from the Daily Mail, on the dangers of painkillers:

Researchers have today revealed the exact risk of having a heart attack or stroke from taking several common painkillers.

They discovered, on average, one in 330 adults who have been taking ibuprofen will experience a heart attack or stroke within four weeks.

However, the drug, costing as little as 20c a tablet and available in supermarkets and dairies, was found to be three times less dangerous than celecoxib, which will lead to one in 105 adults experiencing a heart attack or stroke.

Now, that’s obviously not true for people just taking ibuprofen for an injury or a headache. So what’s the true story?

The research paper is here. As the story says, it followed up 56,000 people in Taiwan with high blood pressure.  The researchers were interested in a group of painkillers called “COX-selective” that have a lower risk of causing ulcers and stomach bleeding, but potentially a higher risk of heart attack and stroke.  One familiar COX-selective painkiller in NZ is Voltaren; familiar non-selective ones are ibuprofen and naproxen. The study wasn’t looking at over-the-counter medications bought in supermarkets and dairies, though, just at people starting prescriptions.

Of the 7,927 people starting prescriptions for ibuprofen, 24 ended up getting a heart attack or stroke, after an average of two weeks’ treatment. Of the 1,779 starting celecoxib prescriptions, 17 ended up getting a heart attack or stroke, after an average of about three weeks’ treatment.  Overall, there was a bit more than one heart attack per ten people per year for those prescribed COX-selective drugs and a bit less than one heart attack per ten people per year for those prescribed non-selective drugs.  And there’s no comparison with people who weren’t taking painkillers.

You might wonder how numbers like 24 and 17 are large enough to say anything reliable. They aren’t. The “exact risk” of 1 in 330 from the lead is actually a range from something like 1 in 200 to 1 in 500, even before you consider the uncertainties in generalising from middle-aged to elderly Taiwanese people with hypertension to other groups.
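
To see where numbers like “1 in 330” and “1 in 200 to 1 in 500” can come from, here is a minimal Python sketch using the counts quoted above. It uses a Wilson score interval for a simple proportion, which is my choice for illustration; the research paper and the range quoted above may rest on a different calculation, and this ignores differences in follow-up time.

```python
import math


def one_in(events, people):
    """Express a proportion as 'one in N'."""
    return people / events


def wilson_interval(events, people, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = events / people
    denom = 1 + z**2 / people
    centre = (p + z**2 / (2 * people)) / denom
    half = z * math.sqrt(p * (1 - p) / people + z**2 / (4 * people**2)) / denom
    return centre - half, centre + half


# Counts as quoted in the post: (drug, heart attacks or strokes, people).
for drug, events, people in [("ibuprofen", 24, 7927), ("celecoxib", 17, 1779)]:
    low, high = wilson_interval(events, people)
    print(
        f"{drug}: about 1 in {one_in(events, people):.0f}, "
        f"interval roughly 1 in {1 / high:.0f} to 1 in {1 / low:.0f}"
    )
```

With only 24 events, the interval for ibuprofen is wide enough to cover risks that differ by more than a factor of two, which is the point of the paragraph above.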

This study on its own provides only very weak evidence that COX-selective drugs are more dangerous. The conclusion is plausible for all sorts of reasons, but it’s hardly conclusive.  Like it says on the packet, don’t take any of these medications for weeks at a time without consulting a more reliable source than the Daily Mail.

Diet and genes: not so simple

One of the potential benefits of genetics in medicine and public health comes when two interventions are about equally good on average, but with a lot of variation between people.  We can hope that genetics explains which intervention works for which people, and lets us pick the right one for each person. So far, this hasn’t happened.

It didn’t happen again this week, with the results of a randomised trial comparing low-fat and low-carb diets.  A group of basically healthy but overweight or obese adults were randomly allocated to being recommended a low-fat diet or a low-carb diet.  After a year, the average weight loss in each group was about 6kg.

There are some genetic variants that have been found in previous studies to predict the success of low-fat vs low-carb diets.  This trial was set up to look at those genetic variants: even though the low-fat diet wasn’t better overall, was it better in people who were expected to be genetically suited to it? Here’s a graph from the research paper showing the distribution of weight losses in each group:


There’s no sign that genetics is helping.

It’s still plausible that genetic differences contribute, and even that we could use them to choose diets if we knew more. But right now, if you want to know whether you’ll lose weight on a particular (reasonable and moderate) diet, the only way to tell is to try it.

February 23, 2018

Briefly

  • Data visibility as a political act: Ben Goldacre and co-conspirators have set up a webpage tracking clinical trials that are violating the FDA Amendments Act (2007) by not having reported any results.  It only became possible to violate the Act this Monday, so the compliance is fairly high so far, nearly 90%.
  • Politician Sam is an expert system from Victoria University of Wellington that’s trying to learn NZ political views.  That’s not an unreasonable thing to try, but reading “Unlike a human politician, I consider everyone’s position, without bias, when making decisions” doesn’t make me more optimistic about the project.
  • Which NZ songs get streamed the most here and overseas? Gareth Shute at the Spinoff
  • “Count on Stats” is an effort by the American Statistical Association to rebuild public confidence in US official statistics.
  • Alice Zhao analysed text messages with her (now) husband from the year they married and the year they started dating — a nice illustration of what you can miss by looking at just one source of information.

February 20, 2018

Super 15 Predictions for Round 2

Team Ratings for Round 2

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

 

Team          Current Rating  Rating at Season Start  Difference
Hurricanes     16.18           16.18                    0.00
Crusaders      15.23           15.23                    0.00
Lions          13.25           13.81                   -0.60
Highlanders    10.29           10.29                   -0.00
Chiefs          9.29            9.29                    0.00
Brumbies        1.75            1.75                    0.00
Sharks          1.57            1.02                    0.60
Stormers        1.36            1.48                   -0.10
Blues          -0.24           -0.24                   -0.00
Waratahs       -3.92           -3.92                   -0.00
Jaguares       -4.51           -4.64                    0.10
Bulls          -4.79           -4.79                    0.00
Reds           -9.47           -9.47                    0.00
Rebels        -14.96          -14.96                    0.00
Sunwolves     -18.42          -18.42                    0.00

 

Performance So Far

So far there have been 2 matches played, 2 of which were correctly predicted, a success rate of 100%.
Here are the predictions for last week’s games.

#  Game                       Date    Score    Prediction  Correct
1  Stormers vs. Jaguares      Feb 17  28 – 20   10.10      TRUE
2  Lions vs. Sharks           Feb 17  26 – 19   16.30      TRUE

 

Predictions for Round 2

Here are the predictions for Round 2. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                       Date    Winner       Prediction
1  Highlanders vs. Blues      Feb 23  Highlanders   14.00
2  Rebels vs. Reds            Feb 23  Reds          -2.00
3  Sunwolves vs. Brumbies     Feb 24  Brumbies     -16.20
4  Crusaders vs. Chiefs       Feb 24  Crusaders      9.40
5  Waratahs vs. Stormers      Feb 24  Stormers      -1.30
6  Lions vs. Jaguares         Feb 24  Lions         21.80
7  Bulls vs. Hurricanes       Feb 24  Hurricanes   -17.00