June 15, 2017

We’re not number three

From the Twitter, via Graeme Edgeler

[Image: childpov]

As Graeme points out, the nice thing about having a link included is that you can check the report (PDF, p8) and find out the claim isn’t true — at least by the source’s definitions.

This one is redrawn to use all the data, with the countries previously left out coloured grey. There’s a pattern.
[Image: fullchildpov]

 

 

June 14, 2017

Comparing sources

The Herald has a front-page-link “Daily aspirin deadlier than we thought”, for a real headline “Daily aspirin behind more than 3000 deaths a year, study suggests”.  The story (from the Daily Telegraph) begins

Taking a daily aspirin is far more dangerous than was thought, causing more than 3000 deaths a year, a major study suggests.

Millions of pensioners should reconsider taking pills which are taken by almost half of elderly people to ward off heart attacks and strokes, researchers said.

The study by Oxford University found that those over the age of 75 who take the blood-thinning pills are 10 times more likely than younger patients to suffer disabling or fatal bleeds.

The BBC also has a report on this research. Their headline is Aspirin ‘major bleed’ warning for over-75s, and the story starts

People over 75 taking daily aspirin after a stroke or heart attack are at higher risk of major – and sometimes fatal – stomach bleeds than previously thought, research in the Lancet shows.

Scientists say that, to reduce these risks, older people should also take stomach-protecting PPI pills.

But they insist aspirin has important benefits – such as preventing heart attacks – that outweigh the risks.

The basic message from the same underlying research seems very different. Sadly, neither story links to the open-access research paper, which has very good sections on the background to the research and what this new study added.

Basically, we know that aspirin reduces blood clotting.  This has good effects (reducing the risk of heart attacks and strokes) and also bad effects (increasing the risk of bleeding).  We do randomised trials to find out whether the benefits exceed the risks, and in the randomised trials of aspirin, they did. However, the randomised trials were mostly in people under 75.

The new study looks at older people, but it wasn’t a randomised trial: everyone in the study was taking aspirin, and there was no control group.  The main comparisons were by age. Serious stomach bleeding was a lot more common in the oldest people in the study, so unless the beneficial effects of aspirin were also larger in these people, the tradeoff might no longer be favourable.

In particular, as the Herald/Telegraph story says, the tradeoff might be unfavourable for old-enough people who hadn’t already had a heart attack or stroke. That’s one important reason for the difference between the two stories.  The research only looked at people who had previously had a heart attack or stroke (or some similar reason to take aspirin). The BBC story focused mostly on these people (who should still take aspirin, but also maybe an anti-ulcer drug); the Herald/Telegraph story focused mostly on those taking aspirin purely as a precaution.

So, even though the Herald/Telegraph story was going for the scare headlines, the content was potentially helpful: absent any news coverage, the healthy aspirin users would be less likely to bring up the issue with their doctors.

 

June 13, 2017

Appropriate subdivisions

From Public Policy Polling on Twitter, a finding that voters are less likely to vote for a member of Congress if they supported the Republican anti-healthcare bill

[Image: ppp]

The problem with this sort of claim, as we’ve seen for NZ examples in the past, is that more than 24% of voters already have ‘not in a million years’ as the baseline willingness-to-support for some candidates. Maybe this vote would just change that to ‘not in two million years’.

Since Public Policy Polling are a reputable survey company (even though I’m not a fan), they publish detailed survey results (PDF).  In these results, they break down the healthcare question by self-reported vote in the 2016 election
[Image: ppp-table]
And, as you’d expect, the detailed story is different.  People who voted for Clinton think the Republican healthcare bill is terrible; people who voted for Trump think it’s basically ok. The net 24% who might change their vote might be better described as a mixture of a net 50% imaginary ‘loss’ of people who already weren’t voting Republican, and a net 20% imaginary ‘gain’ of people who already were.

What’s more striking than the 24% vs 48% overall percentage is that as many as 23% of Trump voters are willing to say something negative about the bill. Still, as an indication that even the hopeful news is unclear, consider this table
[Image: ppp-table2]
Only 13% of Trump voters prefer the current healthcare law, so the 23% who would penalise a Congressperson who voted for the new law includes at least 10% who actually prefer the new law or who aren’t sure.
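
The lower bound in that last step is just subtraction, but it is easy to get the direction wrong. Here is a minimal sketch in Python, using only the two percentages quoted above; the function name is made up for illustration.

```python
def min_outside_overlap(pct_a: float, pct_b: float) -> float:
    """Smallest possible share of people who are in group A but not in group B.

    Even if every member of B is also in A, at least pct_a - pct_b of
    people must be in A without being in B (and never less than zero).
    """
    return max(0.0, pct_a - pct_b)


# Figures quoted in the post, as percentages of Trump voters.
would_penalise = 23.0      # less likely to vote for a supporter of the bill
prefer_current_law = 13.0  # prefer the current healthcare law

lower_bound = min_outside_overlap(would_penalise, prefer_current_law)
print(f"At least {lower_bound:.0f}% would penalise without preferring the current law")
```

The bound is only reached if every Trump voter who prefers the current law is also among those who would penalise; any less overlap makes the real figure larger.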

 

NRL Predictions for Round 15

Team Ratings for Round 15

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Storm         8.01            8.49                    -0.50
Raiders       4.96            9.94                    -5.00
Broncos       4.94            4.36                    0.60
Sharks        4.67            5.84                    -1.20
Roosters      4.17            -1.17                   5.30
Cowboys       4.05            6.90                    -2.90
Panthers      3.58            6.08                    -2.50
Sea Eagles    1.95            -2.98                   4.90
Dragons       -0.54           -7.74                   7.20
Warriors      -1.60           -6.02                   4.40
Eels          -3.57           -0.81                   -2.80
Bulldogs      -3.76           -1.34                   -2.40
Rabbitohs     -4.06           -1.82                   -2.20
Titans        -4.21           -0.98                   -3.20
Wests Tigers  -8.74           -3.89                   -4.80
Knights       -11.88          -16.94                  5.10

 

Performance So Far

So far there have been 107 matches played, 63 of which were correctly predicted, a success rate of 58.9%.
Here are the predictions for last week’s games.

Game                          Date    Score    Prediction  Correct
1  Sharks vs. Storm           Jun 08  13 – 18  1.20        FALSE
2  Sea Eagles vs. Knights     Jun 09  18 – 14  19.80       TRUE
3  Broncos vs. Rabbitohs      Jun 09  24 – 18  13.80       TRUE
4  Titans vs. Warriors        Jun 10  12 – 34  5.60        FALSE
5  Panthers vs. Raiders       Jun 10  24 – 20  1.70        TRUE
6  Eels vs. Cowboys           Jun 10  6 – 32   -0.20       TRUE
7  Wests Tigers vs. Roosters  Jun 11  18 – 40  -7.10       TRUE
8  Bulldogs vs. Dragons       Jun 12  16 – 2   -2.30       FALSE

 

Predictions for Round 15

Here are the predictions for Round 15. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                        Date    Winner     Prediction
1  Rabbitohs vs. Titans     Jun 16  Rabbitohs  3.70
2  Storm vs. Cowboys        Jun 17  Storm      7.50
3  Sharks vs. Wests Tigers  Jun 17  Sharks     16.90
4  Eels vs. Dragons         Jun 18  Eels       0.50
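
As a rough check on the sign convention, here is a small Python sketch that scores the Round 14 predictions against the results in the Performance So Far table above. The `Game` tuple and the function names are illustrative, not the code actually used to produce these predictions, and treating a drawn match as an incorrect prediction is an assumption.

```python
from typing import NamedTuple


class Game(NamedTuple):
    home: str
    away: str
    home_score: int
    away_score: int
    predicted_margin: float  # positive = home win, negative = away win


def predicted_winner(game: Game) -> str:
    """Team picked by the sign of the predicted margin."""
    return game.home if game.predicted_margin > 0 else game.away


def is_correct(game: Game) -> bool:
    """True when predicted and actual margins have the same sign.

    Assumption: a draw (actual margin of zero) counts as incorrect.
    """
    actual_margin = game.home_score - game.away_score
    return actual_margin * game.predicted_margin > 0


# Round 14 results and predictions, copied from the table above.
ROUND_14 = [
    Game("Sharks", "Storm", 13, 18, 1.20),
    Game("Sea Eagles", "Knights", 18, 14, 19.80),
    Game("Broncos", "Rabbitohs", 24, 18, 13.80),
    Game("Titans", "Warriors", 12, 34, 5.60),
    Game("Panthers", "Raiders", 24, 20, 1.70),
    Game("Eels", "Cowboys", 6, 32, -0.20),
    Game("Wests Tigers", "Roosters", 18, 40, -7.10),
    Game("Bulldogs", "Dragons", 16, 2, -2.30),
]

correct_this_round = sum(is_correct(g) for g in ROUND_14)

# Running totals before Round 14, as reported in the Round 14 post.
played_before, correct_before = 99, 58
played = played_before + len(ROUND_14)
correct = correct_before + correct_this_round
print(f"{correct} of {played} correct, a success rate of {100 * correct / played:.1f}%")
```

Note that a prediction as small as -0.20 still counts as a pick for the away team, so a near coin-flip game is scored the same way as a confident one.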

 

June 12, 2017

Stat of the Week Competition: June 10 – 16 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday June 16 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of June 10 – 16 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


June 7, 2017

Fraud or typos?

The Guardian says “Dozens of recent clinical trials may contain wrong or falsified data, claims study”

A UK anaesthetist, John Carlisle, has scraped 5000 clinical-trial publications, where patients are divided randomly into two groups before treatment is assigned, and looked at whether the two groups are more similar or more different than you’d expect by chance.  His motivation appears to be that having groups which are too similar can be a sign of incompetent fraud by someone who doesn’t understand basic statistics. However, the statistical hypothesis he’s testing isn’t actually about fraud, or even about incompetent fraud.

As the research paper notes, some of the anomalous results can be explained by simple writing errors: saying “standard deviation” when you mean “standard error” — and this would, if anything, be evidence against fraud.  Even in the cases where that specific writing error isn’t plausible, looking at the paper can show data fabrication to be an unlikely explanation.  For example, in one of the papers singled out as having a big difference not explainable by the standard deviation/standard error confusion, the difference is in one blood chemistry measurement (tPA) that doesn’t play any real role in the conclusions. The data are not consistent with random error, but they also aren’t consistent with deliberate fraud.  They are more consistent with someone typing 3.2 when they meant 4.2. This would still be a problem with the paper, both because some relatively unimportant data are wrong and because it says bad things about your workflow if you are still typing Table 1 by hand in the 21st century, but it’s not of the same scale as data fabrication.
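
To see why a mislabelled standard error can make two randomised groups look wildly different, here is a minimal Python sketch. It is not the method used in the research paper: it uses a simple normal approximation, and the trial numbers (group size, means, standard deviation) are invented for illustration.

```python
import math


def baseline_z(mean_a: float, mean_b: float, sd: float, n: int) -> float:
    """z statistic for a difference in baseline means.

    Assumes two groups of n patients each with a common standard deviation.
    """
    se_diff = sd * math.sqrt(2.0 / n)
    return (mean_a - mean_b) / se_diff


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical trial: 100 patients per arm, true SD of 10, means differing by 1.
n = 100
true_sd = 10.0
mean_a, mean_b = 50.0, 51.0

# What a checker computes when the table really does report the SD.
z_ok = baseline_z(mean_a, mean_b, true_sd, n)

# What a checker computes when the table reports the standard error
# (SD / sqrt(n)) but labels it "standard deviation".
mislabelled_sd = true_sd / math.sqrt(n)
z_bad = baseline_z(mean_a, mean_b, mislabelled_sd, n)

print(f"correct label: z = {z_ok:.2f}, p = {two_sided_p(z_ok):.3g}")
print(f"SE called SD:  z = {z_bad:.2f}, p = {two_sided_p(z_bad):.3g}")
```

The mislabelling inflates the z statistic by a factor of sqrt(n), so an unremarkable chance difference between the groups shows up as an apparently impossible one, with no fabrication involved.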

You’d think the Guardian might be more sympathetic to typos as an explanation of error.

 

Super 18 Predictions for Round 16 Game, Hurricanes vs Chiefs

Team Ratings for Round 16 Game, Hurricanes vs Chiefs

The basic method is described on my Department home page.

This week is pretty crazy: just one game from round 16, when round 15 has not been completed and won’t be for a month.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Hurricanes    17.92           13.22                   4.70
Crusaders     13.98           8.75                    5.20
Highlanders   11.43           9.17                    2.30
Lions         10.96           7.64                    3.30
Chiefs        8.49            9.75                    -1.30
Brumbies      3.44            3.83                    -0.40
Blues         2.65            -1.07                   3.70
Sharks        1.52            0.42                    1.10
Stormers      0.53            1.51                    -1.00
Waratahs      -0.50           5.81                    -6.30
Bulls         -5.20           0.29                    -5.50
Jaguares      -5.38           -4.36                   -1.00
Force         -8.85           -9.45                   0.60
Cheetahs      -9.83           -7.36                   -2.50
Reds          -10.78          -10.28                  -0.50
Kings         -13.53          -19.02                  5.50
Rebels        -15.58          -8.17                   -7.40
Sunwolves     -18.38          -17.76                  -0.60

 

Performance So Far

So far there have been 120 matches played, 91 of which were correctly predicted, a success rate of 75.8%.
Here are the predictions for last week’s games.

Game                          Date    Score    Prediction  Correct
1  Blues vs. Reds             Jun 02  34 – 29  14.60       TRUE
2  Crusaders vs. Highlanders  Jun 03  25 – 22  6.50        TRUE
3  Chiefs vs. Waratahs        Jun 03  46 – 31  12.70       TRUE
4  Brumbies vs. Rebels        Jun 03  32 – 3   21.60       TRUE
5  Force vs. Hurricanes       Jun 03  12 – 34  -22.90      TRUE

 

Predictions for Round 16, Hurricanes vs. Chiefs

Here are the predictions for the Round 16 game this week. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                      Date    Winner      Prediction
1  Hurricanes vs. Chiefs  Jun 09  Hurricanes  12.90

 

NRL Predictions for Round 14

Team Ratings for Round 14

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Storm         7.50            8.49                    -1.00
Broncos       5.57            4.36                    1.20
Sharks        5.18            5.84                    -0.70
Raiders       5.17            9.94                    -4.80
Panthers      3.37            6.08                    -2.70
Sea Eagles    3.19            -2.98                   6.20
Roosters      3.00            -1.17                   4.20
Cowboys       2.07            6.90                    -4.80
Dragons       0.73            -7.74                   8.50
Eels          -1.60           -0.81                   -0.80
Titans        -2.11           -0.98                   -1.10
Warriors      -3.71           -6.02                   2.30
Rabbitohs     -4.69           -1.82                   -2.90
Bulldogs      -5.04           -1.34                   -3.70
Wests Tigers  -7.57           -3.89                   -3.70
Knights       -13.12          -16.94                  3.80

 

Performance So Far

So far there have been 99 matches played, 58 of which were correctly predicted, a success rate of 58.6%.
Here are the predictions for last week’s games.

Game                         Date    Score    Prediction  Correct
1  Storm vs. Knights         Jun 02  40 – 12  23.30       TRUE
2  Eels vs. Warriors         Jun 02  32 – 24  5.70        TRUE
3  Dragons vs. Wests Tigers  Jun 03  16 – 12  13.30       TRUE
4  Roosters vs. Broncos      Jun 03  18 – 16  0.70        TRUE
5  Cowboys vs. Titans        Jun 03  20 – 8   6.80        TRUE
6  Sea Eagles vs. Raiders    Jun 04  21 – 20  1.60        TRUE
7  Bulldogs vs. Panthers     Jun 04  0 – 38   0.90        FALSE

 

Predictions for Round 14

Here are the predictions for Round 14. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date    Winner      Prediction
1  Sharks vs. Storm           Jun 08  Sharks      1.20
2  Sea Eagles vs. Knights     Jun 09  Sea Eagles  19.80
3  Broncos vs. Rabbitohs      Jun 09  Broncos     13.80
4  Titans vs. Warriors        Jun 10  Titans      5.60
5  Panthers vs. Raiders       Jun 10  Panthers    1.70
6  Eels vs. Cowboys           Jun 10  Cowboys     -0.20
7  Wests Tigers vs. Roosters  Jun 11  Roosters    -7.10
8  Bulldogs vs. Dragons       Jun 12  Dragons     -2.30

June 5, 2017

Briefly

  • Possibly a record false positive rate:  “a substantial number of takedown requests submitted to Google are for URLs that have never been in our search index, and therefore could never have appeared in our search results… Nor is this problem limited to one submitter: in total, 99.95% of all URLs processed from our Trusted Copyright Removal Program in January 2017 were not in our index” (Google submission to Register of Copyrights (PDF), via Techdirt)
  • Problem with rental costs in Canada’s historical CPI: “the clerks who recorded the data were under an instruction that, since the CPI was to represent prices paid by better off working class families, to edit out any rental figures what were above a designated threshold. By the end of the 1950s they were throwing out more than half of the reported rents.” (Worthwhile Canadian Initiative). Data doesn’t just happen: it’s choices by people.
  • I’ve mentioned the University of Washington course “Calling Bullshit on Big Data” before. Now the New Yorker has a story about it.
  • What different sorts of things can go wrong with a statistical prediction rule? A taxonomy, from Ed Felten.
  • Explore NZ mortality rates divided up by ethnicity, income, and age
  • “What we learned from three years of interviews with data journalists, web developers and interactive editors at leading digital newsrooms” Storybench, via Alberto Cairo
  • A couple of examples from the fine UK election tradition of disinformation graphics: Scotland, London

Stat of the Week Competition: June 3 – 9 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday June 9 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of June 3 – 9 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.
