June 8, 2014

Briefly

June 5, 2014

NZ interactive graphic examples

 

  • From The Wireless, a story with maps of voter turnout and registration rates for younger people (RadioNZ might not be where you expect interactive graphics, but there it is). If I were being picky, I would say the popup labels are too big relative to the size of the map window.

Gender, coding, and measurement error

Alyssa Frazee, a PhD student in biostatistics at Johns Hopkins, has an interesting post looking at gender of programmers using the Github code repository. Github users have a profile, which includes a first name, and there are programs that attempt to classify first names by gender.

This graph (click to embiggen, as usual) shows the guessed gender distribution for software with at least five ‘stars’ (likes, sort of) across programming languages. Orange is male, green is female, grey is “don’t know”.

coder-gender

The main message is obvious. Women either aren’t putting code on Github or are using non-gender-revealing or male-associated names.

The other point is that the language with the most female coders seems to be R, the statistical programming language originally developed in Auckland, which has 5.5%.  Sadly, 3.9% of that is code by the very prolific Hadley Wickham (also originally developed in Auckland), who isn’t female. Measurement error, as I’ve written before, has a much bigger impact on rare categories than common ones.
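To make the arithmetic concrete, here is a minimal Python sketch. It assumes the 3.9% is measured in percentage points of starred R repositories (the post does not say so explicitly). The 2% mislabelling rate and the function names are hypothetical, chosen only for illustration.

```python
def observed_share(true_share, false_positive_rate):
    """Share labelled as the category when some non-members are mislabelled into it.

    Simplifying assumption: genuine members of the category are never missed.
    """
    return true_share + (1 - true_share) * false_positive_rate


def relative_inflation(true_share, false_positive_rate):
    """Ratio of the observed share to the true share."""
    return observed_share(true_share, false_positive_rate) / true_share


# Figures quoted in the post, read as percentage points (an assumption).
labelled_female = 5.5
misattributed = 3.9
remaining = labelled_female - misattributed

# Hypothetical 2% mislabelling rate applied to a rare and a common category.
rare = relative_inflation(0.02, 0.02)
common = relative_inflation(0.50, 0.02)

print(f"R share after removing the misattributed code: {remaining:.1f} points")
print(f"rare category overstated {rare:.2f}x, common category {common:.2f}x")
```

The same small error rate nearly doubles the apparent size of a 2% category while barely moving a 50% one, which is the sense in which rare categories are more fragile.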

June 4, 2014

NRL Predictions for Round 13

Team Ratings for Round 13

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Roosters                8.18                   12.35       -4.20
Rabbitohs               6.54                    5.82        0.70
Cowboys                 5.41                    6.01       -0.60
Bulldogs                5.12                    2.46        2.70
Sea Eagles              3.52                    9.10       -5.60
Broncos                 3.51                   -4.69        8.20
Warriors                2.57                   -0.72        3.30
Storm                   1.40                    7.64       -6.20
Panthers                1.34                   -2.48        3.80
Knights                -2.01                    5.23       -7.20
Titans                 -2.18                    1.45       -3.60
Wests Tigers           -5.17                  -11.26        6.10
Raiders                -5.71                   -8.99        3.30
Sharks                 -6.52                    2.32       -8.80
Eels                   -7.34                  -18.45       11.10
Dragons               -10.46                   -7.57       -2.90

 

Performance So Far

So far there have been 91 matches played, 51 of which were correctly predicted, a success rate of 56%.

Here are the predictions for last week’s games.

#  Game                    Date    Score    Prediction  Correct
1  Panthers vs. Eels       May 30  38 – 12       10.30  TRUE
2  Roosters vs. Raiders    May 31  26 – 12       19.50  TRUE
3  Cowboys vs. Storm       May 31  22 – 0         5.50  TRUE
4  Warriors vs. Knights    Jun 01  38 – 18        6.60  TRUE
5  Broncos vs. Sea Eagles  Jun 01  36 – 10       -0.00  FALSE
6  Rabbitohs vs. Dragons   Jun 02  29 – 10       22.20  TRUE

 

Predictions for Round 13

Here are the predictions for Round 13. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                      Date    Winner      Prediction
1  Sea Eagles vs. Bulldogs   Jun 06  Sea Eagles        2.90
2  Eels vs. Cowboys          Jun 06  Cowboys          -8.30
3  Titans vs. Panthers       Jun 07  Titans            1.00
4  Dragons vs. Sharks        Jun 07  Dragons           0.60
5  Rabbitohs vs. Warriors    Jun 07  Rabbitohs         8.50
6  Knights vs. Wests Tigers  Jun 08  Knights           7.70
7  Storm vs. Roosters        Jun 08  Roosters         -2.30
8  Raiders vs. Broncos       Jun 09  Broncos          -4.70
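
The sign rule described above can be sketched in a few lines of Python. The function name is hypothetical, and the code only reads the winner off the published margin; it is not the rating method itself, which is described elsewhere.

```python
def predicted_winner(home, away, margin):
    """Return the predicted winner from the expected points difference.

    A positive margin means a home win, a negative margin an away win.
    A margin of exactly zero implies no winner under this rule.
    """
    if margin > 0:
        return home
    if margin < 0:
        return away
    return None


# (home team, away team, predicted margin) for Round 13, as listed in the table.
round_13 = [
    ("Sea Eagles", "Bulldogs", 2.90),
    ("Eels", "Cowboys", -8.30),
    ("Titans", "Panthers", 1.00),
    ("Dragons", "Sharks", 0.60),
    ("Rabbitohs", "Warriors", 8.50),
    ("Knights", "Wests Tigers", 7.70),
    ("Storm", "Roosters", -2.30),
    ("Raiders", "Broncos", -4.70),
]

winners = [predicted_winner(home, away, margin) for home, away, margin in round_13]
for (home, away, margin), winner in zip(round_13, winners):
    print(f"{home} vs. {away}: {winner} ({margin:+.2f})")
```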

 

How much disagreement should there be?

The Herald

Thousands of school students are being awarded the wrong NCEA grades, a review of last year’s results has revealed.

Nearly one in four grades given by teachers for internally marked work were deemed incorrect after checking by New Zealand Qualifications Authority moderators.

That’s not actually true, because moderators don’t deem grades to be incorrect. That’s not what moderators are for.  What the report says (pp105-107 in case you want to scroll through it) is that in 24% of cases the moderator and the internal assessor disagreed on grade, and in 12% they disagreed on whether the standard had been achieved.

What we don’t know is how much disagreement is appropriate. The only way the moderator’s assessment could be considered error-free is if you define the ‘right answer’ to be ‘whatever the moderator says’, which is obviously not appropriate. There always will be some variation between moderators, and some variation between schools, and what we want to know is whether there is too much.

The report is a bit disappointing from that point of view.  At the very least, there should have been some duplicate moderation. That is, some pieces of work should have been sent to two different moderators, so we could have an idea of the between-moderator agreement rate. Then, if we were willing to assume that moderators collectively were infallible (though not individually), we could estimate how much less reliable the internal assessments were.

Even better would be to get some information on how much variation there is between schools in the disagreement: if there is very little variation, the schools may be doing about as well as is possible, but if there is a lot of variation between schools it would suggest some schools aren’t assessing very reliably.
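
As a rough sketch of the two summaries suggested above, the Python below computes a disagreement rate from paired grades and then breaks it down by school. All of the data are made up for illustration (N, A, M, E stand for Not Achieved, Achieved, Merit, Excellence), and the function and school names are hypothetical; nothing here comes from the NZQA report.

```python
from collections import defaultdict


def disagreement_rate(pairs):
    """Fraction of (grade_a, grade_b) pairs in which the two graders differ."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair of grades")
    return sum(a != b for a, b in pairs) / len(pairs)


def disagreement_by_school(records):
    """Disagreement rate per school.

    records: iterable of (school, assessor_grade, moderator_grade).
    """
    by_school = defaultdict(list)
    for school, assessor, moderator in records:
        by_school[school].append((assessor, moderator))
    return {school: disagreement_rate(p) for school, p in by_school.items()}


# Hypothetical duplicate moderation: the same work graded by two moderators.
duplicate_moderation = [("M", "M"), ("A", "M"), ("E", "E"), ("A", "A")]

# Hypothetical internal assessments checked by a moderator.
records = [
    ("School X", "A", "A"), ("School X", "M", "A"),
    ("School X", "E", "E"), ("School X", "A", "A"),
    ("School Y", "A", "N"), ("School Y", "M", "A"),
    ("School Y", "A", "A"), ("School Y", "E", "M"),
]

between_moderators = disagreement_rate(duplicate_moderation)
per_school = disagreement_by_school(records)

print(f"between-moderator disagreement: {between_moderators:.0%}")
for school, rate in per_school.items():
    print(f"{school}: {rate:.0%} disagreement with the moderator")
```

The between-moderator figure gives a baseline for how much disagreement is unavoidable, and a wide spread in the per-school rates would point to some schools assessing less reliably than others.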

 

June 3, 2014

Are girl hurricanes less scary?

There’s a new paper out in the journal PNAS claiming that hurricanes with female names cause three times as many deaths as those with male names (because people don’t give girl hurricanes the proper respect). Ed Yong does a good job of explaining why this is probably bogus, but no-one seems to have drawn any graphs, which I think make the situation a lot clearer.

June 2, 2014

Stat of the Week Competition: May 31 – June 6 2014

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday June 6 2014.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of May 31 – June 6 2014 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

May 30, 2014

Trusting your data or your model

Even with large amounts of data, automated predictions must usually incorporate explicit or implicit prior understanding of the structure of the problem. “Look for anything” is not good enough: “anything” is too big.

Here, for your weekend light entertainment, are some examples where the prior structure was too strong or too weak:

The example that prompted this post, from the blog of Melville House Press, is about automated scanning of books to create digital editions:

 in many old texts the scanner is reading the word ‘arms’ as ‘anus’ and replacing it as such in the digital edition. As you can imagine, you don’t want to be getting those two things mixed up.

A similar phenomenon was pointed out at Language Log a decade ago:

Fear not your toes, though they are strong,
The conquest doth to you belong;

Daniel Dennett recounts two anecdotes of speech recognition, one human and one computer, which err in the opposite direction to the text recognition example. The computer one:

An AI speech-understanding system whose development was funded by DARPA (Defense Advanced Research Projects Agency), was being given its debut before the Pentagon brass at Carnegie Mellon University some years ago. To show off the capabilities of the system, it had been attached as the “front end” or “user interface” on a chess-playing program. The general was to play white, and it was explained to him that he should simply tell the computer what move he wanted to make. The general stepped up to the mike and cleared his throat–which the computer immediately interpreted as “Pawn to King-4.” 

And, the example that is frustratingly familiar to so many of us: mobile phone autocorrupt, which you can search for yourself.

Levels of evidence

If you find that changing your diet in some way makes you feel happier and healthier, that’s a good thing.  It doesn’t matter whether the same change would be useful for most people, or only useful for you. It doesn’t matter whether the change is a placebo effect. It doesn’t even matter if it’s an illusion, a combination of regression to the mean and confirmation bias. You might check with a doctor or dietician as to whether the change is dangerous, but otherwise, go for it.

If you want to campaign for the entire community to make a change in their diet, you need to have evidence that it’s better on average for the entire community. A few people’s subjective experience isn’t good enough.  Good quality observational data might be all you can manage if the benefits are subtle or take years to appear, but if you’re claiming dramatic short-term benefits you should be able to demonstrate them in a randomised controlled trial.

The reason for mentioning this is that PETA has been making friends again. They’re trying to link milk consumption to autism. They don’t even pretend to have any evidence that milk causes autism, and the evidence that a milk-free diet has a beneficial effect in people with autism is very weak.  That is, there are a few studies that suggest a benefit, but the benefit is smaller in studies with more reliable designs, and absent in the best-designed studies.  The most recent review of the evidence concluded that dairy-free or gluten-free diets should only be tried for people who have some separate evidence of food intolerance.  After reading the review, I would agree.

There are respectable arguments against dairy farming, both ethical and environmental. Scaremongering about autism isn’t one of them.

May 29, 2014

Lede program at Columbia

Columbia University in New York is running an amazing-looking data journalism certificate called The Lede Program. The program director is Cathy O’Neil of mathbabe.org and Occupy Finance, and the program advisor is Mark Hansen, statistician, computational scientist, and artist.

Anyway, their syllabus (and quite a bit of other content) is available on Github.

I’d like to quote a course outline by Cathy O’Neil:

This course begins with the idea that computing tools are the products of human ingenuity and effort. They are never neutral and carry with them the biases of their designers and their design process. “Platform studies” is a new term used to describe investigations into these relationships between computing technologies and the creative or research products that they help to generate. How you understand how data, code, and algorithms affect creative practices can be an effective first step toward critical thinking about technology.