March 15, 2014

Briefly

  • Buzzfeed says “According To Pornhub, The South Watches More Gay Porn Than Any Other Part Of The U.S.” The point, for those of you not up on US sociogeography, is that the South is religiously conservative.  It turns out that’s not really what the data say. The figures aren’t % of men who watch gay porn, they are % of porn that is gay male. The data are equally consistent with straight guys in religiously conservative states watching very slightly less porn than those in other states. Data on total porn consumption are mixed.
  • ProPublica, the non-profit, public-interest journalism foundation in the US, are setting up a data shop. Data that they could just download, they’ll make available for free, but the data that took a lot of effort will cost. Interesting to see this as a data journalism funding model.
  • From ProPublica, a good example of simple arithmetic applied to unreasonable claims.

Since 2009, Dagogo-Jack has been paid at least $257,000 by Glaxo, Lilly and Merck.

“If you actually prorate that by the hours put in, it is barely more than minimum wage,” he said. (A person earning the federal minimum wage of $7.25 would have to work 24 hours a day, seven days a week for more than four years to earn Dagogo-Jack’s fees.)
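The arithmetic in the parenthesis is easy to check. Here is a minimal sketch in Python; the variable names are my own, and the only inputs are the two figures quoted above ($257,000 in fees and the $7.25 federal minimum wage).

```python
# Figures quoted in the ProPublica piece
fees = 257_000      # dollars paid since 2009
min_wage = 7.25     # US federal minimum wage, dollars per hour

# Hours of minimum-wage work needed to earn the same amount
hours_needed = fees / min_wage

# Years needed if working 24 hours a day, seven days a week
years_nonstop = hours_needed / (24 * 365.25)

print(f"{hours_needed:,.0f} hours, or {years_nonstop:.2f} years of round-the-clock work")
```

The result comes out slightly above four years, which matches the "more than four years" in the quote.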

March 14, 2014

The wind and the rain

Cyclone Lusi, from the earth wind animation

lusi

 

And coloured by total precipitable water (orange: dry, light blue: very wet)

lusi-water

Keep safe.

 

March 13, 2014

NRL Predictions for Round 2

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

This week I don’t have full details because I have limited internet access and am having to copy the details from my computer.
 

Predictions for Round 2

Here are the predictions for Round 2. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

   Game                       Date    Winner      Prediction
1  Sea Eagles vs. Rabbitohs   Mar 14  Sea Eagles        5.20
2  Broncos vs. Cowboys        Mar 14  Cowboys          -3.40
3  Warriors vs. Dragons       Mar 15  Warriors          7.40
4  Storm vs. Panthers         Mar 15  Storm            13.10
5  Roosters vs. Eels          Mar 15  Roosters         30.50
6  Titans vs. Wests Tigers    Mar 16  Titans           19.40
7  Knights vs. Raiders        Mar 16  Knights          15.30
8  Bulldogs vs. Sharks        Mar 17  Bulldogs          4.10
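
The sign convention in the Prediction column can be illustrated with a short Python sketch. This is not the code used to produce the predictions; `predicted_winner` is a hypothetical helper that only shows how a margin (home points minus away points) maps to the Winner column.

```python
def predicted_winner(home, away, margin):
    """Map an expected points difference (home minus away) to a predicted winner.

    A positive margin is a win to the home team, a negative margin a win
    to the away team. A margin of exactly zero gives no prediction.
    """
    if margin > 0:
        return home
    if margin < 0:
        return away
    return None


# Two rows from the Round 2 table; the home team is listed first
print(predicted_winner("Sea Eagles", "Rabbitohs", 5.20))
print(predicted_winner("Broncos", "Cowboys", -3.40))
```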

 

Super 15 Predictions for Round 5

Team Ratings for Round 5

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team         Current Rating  Rating at Season Start  Difference
Crusaders              6.27                    8.80       -2.50
Sharks                 5.75                    4.57        1.20
Chiefs                 4.67                    4.38        0.30
Brumbies               4.02                    4.12       -0.10
Bulls                  3.92                    4.87       -1.00
Waratahs               3.82                    1.67        2.20
Stormers               2.58                    4.38       -1.80
Reds                   0.49                    0.58       -0.10
Cheetahs              -1.68                    0.12       -1.80
Blues                 -1.79                   -1.92        0.10
Hurricanes            -1.89                   -1.44       -0.50
Highlanders           -3.34                   -4.48        1.10
Lions                 -4.26                   -6.93        2.70
Force                 -5.16                   -5.37        0.20
Rebels                -6.40                   -6.36       -0.00

 

Performance So Far

So far there have been 22 matches played, 14 of which were correctly predicted, a success rate of 63.6%.

Here are the predictions for last week’s games.

   Game                      Date    Score    Prediction  Correct
1  Hurricanes vs. Brumbies   Mar 07  21 – 29       -1.00  TRUE
2  Reds vs. Cheetahs         Mar 07  43 – 33        5.60  TRUE
3  Crusaders vs. Stormers    Mar 08  14 – 13        8.70  TRUE
4  Force vs. Rebels          Mar 08  32 – 7         0.90  TRUE
5  Bulls vs. Blues           Mar 08  38 – 22        8.80  TRUE
6  Sharks vs. Lions          Mar 08  37 – 23       14.00  TRUE

 

Predictions for Round 5

Here are the predictions for Round 5. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

   Game                      Date    Winner       Prediction
1  Chiefs vs. Stormers       Mar 14  Chiefs             6.10
2  Rebels vs. Crusaders      Mar 14  Crusaders         -8.70
3  Hurricanes vs. Cheetahs   Mar 15  Hurricanes         3.80
4  Highlanders vs. Force     Mar 15  Highlanders        5.80
5  Brumbies vs. Waratahs     Mar 15  Brumbies           2.70
6  Lions vs. Blues           Mar 15  Lions              1.50
7  Sharks vs. Reds           Mar 15  Sharks             7.80

 

March 11, 2014

Predicting dementia

The Herald has a story about a potential blood test for dementia, which gives the opportunity to talk about an important statistical issue. The research seems to be good, and the results are plausible, though they need to be confirmed in a separate, larger sample before they can really be believed. Also, the predictions so far are just for mild cognitive impairment, not actual dementia. But it’s the description of the accuracy of the test that might be misleading.

The test had 90% sensitivity — 90% of those who developed cognitive impairment tested positive. It had 90% specificity — 90% of those who did not develop cognitive impairment tested negative. That’s what is described in the story as 90% accuracy. What a user would care about is the positive predictive value: if you test positive, how likely are you to get cognitive impairment?

In the study, 451 people started out cognitively normal; 28 of these developed impairment, the other 423 did not. The test would be correctly positive for about 25 of the 28, and correctly negative for about 381 of the 423, leaving about 42 false positives. So, of the 25+42=67 who test positive, only about 37% (25 of 67) will develop impairment. That’s reasonable for a diagnostic test but a bit low for a screening test in healthy people.
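
The same calculation can be written out as a short Python sketch. The function name and arguments are my own; the inputs are the study numbers above (28 people who developed impairment, 423 who did not) and the reported 90% sensitivity and specificity.

```python
def positive_predictive_value(n_cases, n_noncases, sensitivity, specificity):
    """Proportion of positive tests that are true positives."""
    true_pos = sensitivity * n_cases             # cases correctly flagged
    false_pos = (1 - specificity) * n_noncases   # non-cases wrongly flagged
    return true_pos / (true_pos + false_pos)


ppv = positive_predictive_value(
    n_cases=28, n_noncases=423, sensitivity=0.90, specificity=0.90
)
print(f"Positive predictive value: {ppv:.1%}")
```

Because only about 6% of the cohort developed impairment, the false positives from the much larger unaffected group outnumber the true positives, which is why the predictive value ends up well below 90%.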

Where the test is more immediately relevant is in designing clinical trials. So far, attempts to affect Alzheimer’s Disease progression have failed, though there are some modestly effective symptomatic treatments. It’s possible that the treatments are doing the right thing but that clinical illness is too late, so there’s a lot of interest in testing treatments very early in the process. A test like the new one could be very useful.

March 10, 2014

Stat of the Week Competition: March 8 – 14 2014

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday March 14 2014.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of March 8 – 14 2014 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

March 9, 2014

Briefly

  • Rafa, at Simply Statistics, shows that countries with higher GDP per capita also tend to have had women voting for longer. Yes, he does know about correlation and causation.
  • Felix Salmon writes about, essentially, Bayesian updating given conflicting information: “the probability that Dorian is Satoshi would seem to be very small, and the probability that Dorian is not Satoshi would seem to be just as small — and yet, somehow, when you add the two probabilities together, the total needs to come to something close to 100%.”
  • Viz for a cause, a new archive of data visualisations advocating for various causes. The current examples come from Tableau Public, which might be worth a look for online displays.
  • Andrew Gelman on “How much time (if any) should we spend criticizing research that’s fraudulent, crappy, or just plain pointless?” You can tell my answer from StatsChat. When it’s just consenting scientists in the journals, I’ve got better things to do. When there’s enough PR applied to get it into the NZ media, I try to respond. Sometimes it’s bad science; often it’s perfectly good underlying science and bad press releases. Remember, almost nothing from the scientific literature gets into the papers accidentally. Someone — the scientist, the journal, the university — has to push.

March 7, 2014

Careers in statistics

From Science Careers

“[The Bureau of Labor Statistics] projects that statistics jobs will grow 27% from 2012 to 2022, putting the profession in the ‘much faster than the average for all occupations’ growth category. The bureau puts statisticians’ median annual salary in 2012 at $75,560.”

In addition to having a different quote from Hal Varian than the one you were expecting, they talk to statisticians including Xihong Lin and Montse Fuentes.

Graphics design rules

1. Barcharts must start at zero, from Storytelling with Data

2. Infographics as a proxy for overall news quality (barcharts must start at zero), from The Functional Art

3. And, from Storytelling with Data, perhaps the worst use of colour ever in donut charts. Statisticians keep saying it’s hard to compare pie/donut charts reliably. Notice how the two donuts below look very similar? Now try looking at the legends.

donut

Remember: U and DON’T makes DONUT.

March 6, 2014

Attack of the killer lamb?

No, not that one, the story about eating meat.

Stuff has the more egregious version, “Eating meat ‘as bad as smoking’”; the Herald has the rather better “Protein packed diet nearly as bad as smoking – expert”.

First, the good bits. Both stories are better than the UK versions: the Herald talks to Australian experts and brings in a related study; the Fairfax story at least mentions an outside scientific opinion and gives a link (though it’s to the university press release, which doesn’t link further to the research paper).

The researchers compared people who ate a high-protein diet (just under 20% of the people) to those who ate a low-protein diet (just over 5%), and found a 70% higher rate of death in the high-protein group, in people aged 55-64. The study was observational, but it was in a representative sample of the US and was backed up by experiments in mice. That’s not completely reliable, but it is a big step.

The 70% higher rate of death for high-protein vs low-protein diets compares to a slightly over 100% higher rate for current smokers vs non-smokers in previous research using data from the same survey. You could get away with calling that ‘nearly as bad’, especially as other surveys have tended to give smaller differences. So, the Herald’s headline is defensible. Stuff’s headline drops the ‘nearly’ and the ‘packed’, and refers to ‘meat’ rather than ‘protein’. It would be easy for a casual reader to get the false impression that the research had found eating meat was as bad as smoking.

There are two really big holes in the coverage, though. The Herald alludes to one of them but doesn’t follow up:

People on high-protein diets are likely to lose years of life along with the weight they shed, according to two studies.

All the statistical analyses in the paper attempted to control for weight, i.e., they were trying to compare people on high- and low-protein diets with the same weight. That’s not the relevant question for many people on these diets — the attraction of the diet is that it’s easier to lose weight. The relevant question for them is a comparison between a high-protein diet with lower weight and a low-protein diet with higher weight. That question could have been addressed with the data, but it wasn’t.

A rather less subtle omission is that neither story, nor the press release, mentions a key point of the paper: that the association reverses in people over 65.