March 25, 2014

On a scale of 1 to 10

Via @neil_, an interactive graph of ratings for episodes of The Simpsons

[Image: interactive graph of ratings for episodes of The Simpsons]

This comes from graphtv, which lets you do this for all sorts of shows (eg, Breaking Bad, which strikingly gets better ratings as the season progresses, then resets)

The reason the Simpsons graph has extra relevance to StatsChat is the distinctive horizontal line. For the first ten seasons an episode basically couldn’t get rated below 7.5; after that it basically couldn’t get rated above 7.5. In the beginning there were ‘typical’ episodes and ‘good’ episodes; now there are ‘typical’ episodes and ‘bad’ episodes.

This could be a real change in quality, but it doesn’t match up neatly with the changes in personnel and style.  It could be a change in the people giving the ratings, or in the interpretation of the scale over time. How could we tell? One clue is that (based on checking just a handful of points) in the early years the high-rating episodes were rated by more people, and this difference has vanished or even reversed.

March 24, 2014

Briefly

  • Data visualisation: summary of street grid angles in various US cities. The Houston one is a bit misleading because the highways are so dominant in reality but not in the summary.

Stat of the Week Competition: March 22 – 28 2014

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday March 28 2014.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of March 22 – 28 2014 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

(more…)

March 22, 2014

Facts and values

In a rant against ‘data journalism’ in general and fivethirtyeight.com in particular, Leon Wieseltier writes in the New Republic:

Many of the issues that we debate are not issues of fact but issues of value. There is no numerical answer to the question of whether men should be allowed to marry men, and the question of whether the government should help the weak, and the question of whether we should intervene against genocide. And so the intimidation by quantification practiced by Silver and the other data mullahs must be resisted. Up with the facts! Down with the cult of facts! 

There are questions of values that are separate from questions of fact, even if the philosopher Hume went too far in declaring “no ‘ought’ deducible from ‘is’”. There may even be things we should or should not do regardless of the consequences. Mostly, though, our decisions should depend on the consequences.

We should help the weak. That’s a value held by most of us and not subject to factual disproof. How we should do it is more complicated. How much money should be spent? How much should we make people do to prove they need help? Is it better to give people money or vouchers for specific goods and services? Is it better to make more good jobs available or to give more help to those who can’t get them? How much does participating in small social and political community groups or supporting independent radical writers and thinkers help versus putting the same effort into paying lobbyists or donating to political parties or individual candidates? Is it important to restrict wealth and power of small elites, and what costs are worth paying to do so? How much discretion should be given to police and the judiciary to go lightly on the weak, and how much should they be given strict rules to stop them going lightly on the strong? Is a minimum wage increase better than a low-income subsidy? Are the weak better off if we have a tax system that’s not very progressive in theory but is hard for the rich and powerful to evade?

As soon as you want to do something, rather than just have good intentions about it, the consequences of your actions matter, and you have a moral responsibility to find out what those consequences are likely to be.

Polls and role-playing games

An XKCD classic

[Image: the XKCD comic ‘Sports’]

The mouseover text says “Also, all financial analysis. And, more directly, D&D.” 

We’re getting to the point in the electoral cycle where opinion polls qualify as well. There will be lots of polls, and lots of media and blog writing that tries to tell stories about the fluctuations from poll to poll that fit in with the writers’ biases or their need to sell advertising. So, as an aid to keeping calm and believing nothing, I thought a reminder about variability would be useful.

The standard NZ opinion poll has 750-1000 people. The ‘maximum margin of error’ is about 3.5% for 750 and about 3% for 1000. If the poll is of a different size, they will usually quote the maximum margin of error. If you have 20 polls, 19 of them should get the overall left:right division to within the maximum margin of error.
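As a rough check on those figures, here is a minimal sketch of the usual textbook calculation. It assumes a simple random sample and a party near 50% support, where the margin is widest; the function name is just for illustration, and real polls with weighting will differ a little.

```python
import math

def max_margin_of_error(n, z=1.96):
    """Approximate 95% margin of error for a proportion near 50%.

    Assumes a simple random sample of n people (an idealisation).
    """
    return z * math.sqrt(0.25 / n)

for n in (750, 1000):
    # Printed as a percentage, rounded to one decimal place.
    print(n, round(100 * max_margin_of_error(n), 1))
```

The results land close to the rounded values polling companies usually quote for samples of this size.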

If you took 3.5% from the right-wing coalition and moved it to the left-wing coalition, or vice versa, you’d change the gap between them by 7% and get very different election results, so getting this level of precision 19 times out of 20 isn’t actually all that impressive unless you consider how much worse it could be. And in fact, polls likely do a bit worse than this: partly because voting preferences really do change, partly because people lie, and partly because random sampling is harder than it looks.

Often, news headlines are about changes in a poll, not about a single poll. The uncertainty in a change is  higher than in a single value, because one poll might have been too low and the next one too high.  To be precise, the uncertainty is 1.4 times higher for a change.  For a difference between two 750-person polls, the maximum margin of error is about 5%.
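The factor of 1.4 is the square root of 2: for two independent polls the errors add in quadrature. A small sketch, assuming two independent simple random samples of 750 people each (the function name is made up for this example):

```python
import math

def margin_of_error_for_change(moe_first, moe_second):
    """Margin of error for the difference between two independent estimates."""
    return math.sqrt(moe_first ** 2 + moe_second ** 2)

# Maximum margin of error for one 750-person poll (proportion near 50%).
moe_750 = 1.96 * math.sqrt(0.25 / 750)
change = margin_of_error_for_change(moe_750, moe_750)

print(round(change / moe_750, 2))   # ratio to the single-poll margin
print(round(100 * change, 1))       # as a percentage
```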

You might want a less-conservative margin than 19 out of 20. The ‘probable error’ is the error you’d expect half the time. For a 750-person poll the probable error is 1.3% for a single party and single poll, 2.6% for the difference between left and right in a single poll, and 1.9% for a difference between two polls for the same major party.

These are all for major parties. At the 5% MMP threshold the margin of error is smaller: you can be pretty sure a party polling below 3.5% isn’t getting to the threshold and one polling above 6.5% is, but that’s about it.

If a party gets an electorate seat and you want to figure out if they are getting a second List seat, a national poll is not all that helpful. The data are too sparse, and the random sampling is less reliable because minor parties tend to have more concentrated support.   At 2% support the margin of error for a single poll is about 1% each way.
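The same formula, with the party's actual level of support in place of 50%, shows why the margin shrinks for minor parties. This is only a sketch: it assumes a simple random sample of 750 and uses the normal approximation, which is rough at small percentages, and it ignores the concentrated-support problem mentioned above.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a party with true support p."""
    return z * math.sqrt(p * (1 - p) / n)

for support in (0.02, 0.05, 0.50):
    moe = margin_of_error(support, 750)
    print(f"support {support:.0%}: margin about {100 * moe:.1f} percentage points")
```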

Single polls are not very useful, but multiple polls are much better, as the last US election showed. All the major pundits who used sensible averages of polls were more accurate than essentially everyone else. That’s not to say expert opinion is useless, just that if you have to pick just one of statistical voodoo and gut instinct, statistics seems to work better.

In NZ there are several options. Peter Green does averages that get posted at Dim Post; his code is available. KiwiPollGuy does averages and also writes about the iPredict betting markets, and pundit.co.nz has a Poll of Polls. These won’t work quite as well as in the US, because the US has an insanely large number of polls and elections to calibrate them, but any sort of average is a big improvement over looking at one poll at a time.

A final point: national polls tell you approximately nothing about single-electorate results. There’s just no point even looking at national polling results for ACT or United Future if you care about Epsom or Ohariu.

March 21, 2014

Common exposures are common

A California head-lice treatment business has had huge success in publicising its business with the claim that selfies are causing a  rise in nits among teenagers. The Herald mentions this in Sideswipe, the right place for this sort of story, but other international sites have been less discriminating.

There are no actual numbers involved, and nothing like representative data even if you’re in the South Bay area of central California. More importantly, though, there is no comparison group. The owner of the business, Mary MacQuillan, says “Every teen I’ve treated, I ask about selfies, and they admit that they are taking them every day.”  That’s probably only a slight exaggeration at most, but every teen she hasn’t treated has also probably been taking photos that way. It’s something teenagers do.  Common exposures are common.
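To see why the comparison group matters, here is a small sketch with entirely made-up numbers (nothing here comes from the business or from any real survey). If nearly all treated teens take selfies, but so do nearly all untreated teens, the odds ratio is 1 and the exposure tells you nothing.

```python
def exposure_odds_ratio(cases_exposed, cases_unexposed,
                        controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Hypothetical counts: 95 of 100 treated teens take selfies,
# and so do 950 of 1000 teens who never needed treatment.
ratio = exposure_odds_ratio(95, 5, 950, 50)
print(ratio)
```

Without the second pair of numbers, which the story never supplies, there is no way to compute any ratio at all.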

So, why were news organisations around the world publicising this? The fact that it’s about teenagers and the internet goes a long way to explaining it. It doesn’t need evidence because teenage use of technology is automatically scary and newsworthy: as Ms MacQuillan says, “I think parents need to be aware, and teenagers need to be aware too. Selfies are fun, but the consequences are real.”

You get the same thing happening with ‘chemicals’, as the dihydrogen monoxide parody website loves to point out:

A recent stunning revelation is that in every single instance of violence in our country’s schools, …, dihydrogen monoxide was involved.

 

March 20, 2014

Beyond the margin of error

From Twitter, this morning (the graphs aren’t in the online story)

Now, the Herald-Digipoll is supposed to be a real survey, with samples that are more or less representative after weighting. There isn’t a margin of error reported, but the standard maximum margin of error would be a little over 6%.

There are two aspects of the data that make it not look representative. The first is that only 31.3%, or 37% of those claiming to have voted, said they voted for Len Brown last time. He got 47.8% of the vote. That discrepancy is a bit larger than you’d expect just from bad luck; it’s the sort of thing you’d expect to see about 1 or 2 times in 1000 by chance.

More impressively, 85% of respondents claimed to have voted. Only 36% of those eligible in Auckland actually voted. The standard polling margin of error is ‘two sigma’, twice the standard deviation.  We’ve seen the physicists talk about ‘5 sigma’ or ‘7 sigma’ discrepancies as strong evidence for new phenomena, and the operations management people talk about ‘six sigma’ with the goal of essentially ruling out defects due to unmanaged variability.  When the population value is 36% and the observed value is 85%, that’s a 16 sigma discrepancy.
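The arithmetic behind the sigma count is short enough to sketch. The margin of error used here, 6%, is an assumption standing in for ‘a little over 6%’, since the story doesn’t report one; the function name is illustrative only.

```python
def sigma_discrepancy(observed, population, margin_of_error):
    """Gap between observed and population values, in standard deviations.

    Treats the polling margin of error as two standard deviations.
    """
    sigma = margin_of_error / 2
    return (observed - population) / sigma

# 85% claimed to have voted; 36% of those eligible actually did.
z = sigma_discrepancy(0.85, 0.36, 0.06)
print(round(z))
```

A slightly larger margin of error gives a slightly smaller figure, but it still rounds to about 16.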

The text of the story says ‘Auckland voters’, not ‘Aucklanders’, so I checked to make sure it wasn’t just that 12.4% of the people voted in the election but didn’t vote for mayor. That explanation doesn’t seem to work either: only 2.5% of mayoral ballots were blank or informal. It also doesn’t work if you assume the sample was people who voted in the last national election. Digipoll are a respectable polling company, which is why I find it hard to believe there isn’t a simple explanation, but if so it isn’t in the Herald story. I’m a bit handicapped by the fact that the University of Texas internet system bizarrely decides to block the Digipoll website.

So, how could the poll be so badly wrong? It’s unlikely to just be due to bad sampling — you could do better with a random poll of half a dozen people. There’s got to be a fairly significant contribution from people whose recall of the 2013 election is not entirely accurate, or to put it more bluntly, some of the respondents were telling porkies.  Unfortunately, that makes it hard to tell if results for any of the other questions bear even the slightest relationship to the truth.

 

 

 

March 19, 2014

Revised Super 15 Predictions for Round 6

I had a mistake in my code: the country assigned to the Sharks was incorrect, so the previously posted prediction for the Bulls versus Sharks game was wrongly calculated. I now have the Sharks to win by 0.10 points.

All the other predictions for the round are unchanged.

Team Ratings for Round 6

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Sharks                  6.52                    4.57        2.00
Crusaders               6.07                    8.80       -2.70
Chiefs                  5.30                    4.38        0.90
Brumbies                4.19                    4.12        0.10
Bulls                   3.92                    4.87       -1.00
Waratahs                3.65                    1.67        2.00
Stormers                1.95                    4.38       -2.40
Hurricanes             -0.22                   -1.44        1.20
Reds                   -0.28                    0.58       -0.90
Blues                  -1.90                   -1.92        0.00
Cheetahs               -3.35                    0.12       -3.50
Highlanders            -3.85                   -4.48        0.60
Lions                  -4.15                   -6.93        2.80
Force                  -4.65                   -5.37        0.70
Rebels                 -6.20                   -6.36        0.20

 

Performance So Far

So far there have been 29 matches played, 20 of which were correctly predicted, a success rate of 69%.

Here are the predictions for last week’s games.

   Game                     Date    Score    Prediction  Correct
1  Chiefs vs. Stormers      Mar 14  36 – 20        6.10  TRUE
2  Rebels vs. Crusaders     Mar 14  19 – 25       -8.70  TRUE
3  Hurricanes vs. Cheetahs  Mar 15  60 – 27        3.80  TRUE
4  Highlanders vs. Force    Mar 15  29 – 31        5.80  FALSE
5  Brumbies vs. Waratahs    Mar 15  28 – 23        2.70  TRUE
6  Lions vs. Blues          Mar 15  39 – 36        1.50  TRUE
7  Sharks vs. Reds          Mar 15  26 – 6         7.80  TRUE

 

Predictions for Round 6

Here are the predictions for Round 6. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

   Game                        Date    Winner      Prediction
1  Highlanders vs. Hurricanes  Mar 21  Hurricanes       -1.10
2  Waratahs vs. Rebels         Mar 21  Waratahs         12.40
3  Blues vs. Cheetahs          Mar 22  Blues             5.40
4  Brumbies vs. Stormers       Mar 22  Brumbies          6.20
5  Force vs. Chiefs            Mar 22  Chiefs           -5.90
6  Lions vs. Reds              Mar 22  Lions             0.10
7  Bulls vs. Sharks            Mar 22  Sharks           -0.10
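
The sign convention used in these tables can be sketched in a few lines of Python. This only illustrates how to read the predictions and the ‘Correct’ column; it is not the rating code itself. The function names are made up for the example, and counting a drawn game or a zero margin as ‘not correct’ is an assumption of the sketch.

```python
def predicted_winner(home, away, margin):
    """A positive margin means the home team is picked, a negative one the away team."""
    return home if margin > 0 else away

def prediction_correct(home_score, away_score, margin):
    """True when the predicted margin and the actual result favour the same team."""
    return (home_score - away_score) * margin > 0

# Round 6: Highlanders (home) vs. Hurricanes, predicted margin -1.10.
print(predicted_winner("Highlanders", "Hurricanes", -1.10))

# Last week: Highlanders 29 - 31 Force, predicted margin 5.80 to the home team.
print(prediction_correct(29, 31, 5.80))

# Last week: Chiefs 36 - 20 Stormers, predicted margin 6.10 to the home team.
print(prediction_correct(36, 20, 6.10))
```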

 

NRL Predictions for Round 3

Team Ratings for Round 3

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Roosters               12.30                   12.35       -0.00
Rabbitohs               8.09                    5.82        2.30
Sea Eagles              8.07                    9.10       -1.00
Storm                   7.19                    7.64       -0.40
Cowboys                 4.04                    6.01       -2.00
Bulldogs                3.73                    2.46        1.30
Knights                 1.16                    5.23       -4.10
Panthers                0.84                   -2.48        3.30
Titans                 -1.49                    1.45       -2.90
Sharks                 -1.60                    2.32       -3.90
Broncos                -2.37                   -4.69        2.30
Dragons                -4.18                   -7.57        3.40
Warriors               -5.81                   -0.72       -5.10
Raiders                -5.86                   -8.99        3.10
Wests Tigers           -8.36                  -11.26        2.90
Eels                  -17.54                  -18.45        0.90

 

Performance So Far

So far there have been 16 matches played, 6 of which were correctly predicted, a success rate of 37.5%.

Here are the predictions for last week’s games.

   Game                      Date    Score    Prediction  Correct
1  Sea Eagles vs. Rabbitohs  Mar 14  14 – 12        5.20  TRUE
2  Broncos vs. Cowboys       Mar 14  16 – 12       -3.40  FALSE
3  Warriors vs. Dragons      Mar 15  12 – 31        7.40  FALSE
4  Storm vs. Panthers        Mar 15  18 – 17       13.10  TRUE
5  Roosters vs. Eels         Mar 15  56 – 4        30.50  TRUE
6  Titans vs. Wests Tigers   Mar 16  12 – 42       19.40  FALSE
7  Knights vs. Raiders       Mar 16  20 – 26       15.30  FALSE
8  Bulldogs vs. Sharks       Mar 17  42 – 4         4.10  TRUE

 

Predictions for Round 3

Here are the predictions for Round 3. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

   Game                        Date    Winner      Prediction
1  Wests Tigers vs. Rabbitohs  Mar 21  Rabbitohs       -12.00
2  Broncos vs. Roosters        Mar 21  Roosters        -10.20
3  Panthers vs. Bulldogs       Mar 22  Panthers          1.60
4  Sharks vs. Dragons          Mar 22  Sharks            7.10
5  Cowboys vs. Warriors        Mar 22  Cowboys          14.40
6  Sea Eagles vs. Eels         Mar 23  Sea Eagles       30.10
7  Raiders vs. Titans          Mar 23  Raiders           0.10
8  Storm vs. Knights           Mar 24  Storm            10.50

 
