Stats Chat

June 8, 2014

Briefly

By Thomas Lumley

A Lotto story in the Herald that isn’t untrue or importantly misleading. Yay!
The Art of Risk: videos from Leeds University
“Incidentalomas“: the problem with new medical screening technologies

June 5, 2014

NZ interactive graphic examples

By Thomas Lumley

From MBIE, the Regional Economic Activity Report is much less soporific than you’d imagine, thanks to interactive graphics (I believe by Dragonfly Science).

From The Wireless, a story with maps of voter turnout and registration rates for younger people (RadioNZ might not be where you expect interactive graphics, but there it is). If I were being picky, I would say the popup labels are too big relative to the size of the map window.


Gender, coding, and measurement error

By Thomas Lumley

Alyssa Frazee, a PhD student in biostatistics at Johns Hopkins, has an interesting post looking at gender of programmers using the Github code repository. Github users have a profile, which includes a first name, and there are programs that attempt to classify first names by gender.

This graph (click to embiggen, as usual) shows the guessed gender distribution for software with at least five ‘stars’ (likes, sort of) across programming languages. Orange is male, green is female, grey is “don’t know”.

The main message is obvious. Women either aren’t putting code on Github or are using non-gender-revealing or male-associated names.

The other point is that the language with the most female coders seems to be R, the statistical programming language originally developed in Auckland, which has 5.5%. Sadly, 3.9% of that is code by the very prolific Hadley Wickham (also originally developed in Auckland), who isn’t female. Measurement error, as I’ve written before, has a much bigger impact on rare categories than common ones.
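The arithmetic behind that last point can be sketched as follows. This is only an illustration: the 5.5% and 3.9% figures are from the post, while the 60% "common category" and the helper name `strip_misclassified` are invented for comparison.

```python
def strip_misclassified(observed_pct, misclassified_pct):
    """Return (corrected share, overstatement relative to the corrected share)."""
    corrected = observed_pct - misclassified_pct
    return corrected, misclassified_pct / corrected


# Figures from the post: R's guessed-female share, and the part of it
# that comes from one prolific author who is not female.
rare_corrected, rare_overstatement = strip_misclassified(5.5, 3.9)

# Hypothetical common category: 60% is an invented number, hit by the
# same 3.9 percentage points of misclassification.
common_corrected, common_overstatement = strip_misclassified(60.0, 3.9)

print(f"rare category:   {rare_corrected:.1f}% after correction, "
      f"overstated by {rare_overstatement:.0%} of its corrected size")
print(f"common category: {common_corrected:.1f}% after correction, "
      f"overstated by {common_overstatement:.0%} of its corrected size")
```

The same absolute error more than triples the apparent size of the rare category but barely moves the common one.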


June 4, 2014

NRL Predictions for Round 13

By David Scott

Team Ratings for Round 13

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

	Current Rating	Rating at Season Start	Difference
Roosters	8.18	12.35	-4.20
Rabbitohs	6.54	5.82	0.70
Cowboys	5.41	6.01	-0.60
Bulldogs	5.12	2.46	2.70
Sea Eagles	3.52	9.10	-5.60
Broncos	3.51	-4.69	8.20
Warriors	2.57	-0.72	3.30
Storm	1.40	7.64	-6.20
Panthers	1.34	-2.48	3.80
Knights	-2.01	5.23	-7.20
Titans	-2.18	1.45	-3.60
Wests Tigers	-5.17	-11.26	6.10
Raiders	-5.71	-8.99	3.30
Sharks	-6.52	2.32	-8.80
Eels	-7.34	-18.45	11.10
Dragons	-10.46	-7.57	-2.90

Performance So Far

So far there have been 91 matches played, 51 of which were correctly predicted, a success rate of 56%.

Here are the predictions for last week’s games.

	Game	Date	Score	Prediction	Correct
1	Panthers vs. Eels	May 30	38 – 12	10.30	TRUE
2	Roosters vs. Raiders	May 31	26 – 12	19.50	TRUE
3	Cowboys vs. Storm	May 31	22 – 0	5.50	TRUE
4	Warriors vs. Knights	Jun 01	38 – 18	6.60	TRUE
5	Broncos vs. Sea Eagles	Jun 01	36 – 10	-0.00	FALSE
6	Rabbitohs vs. Dragons	Jun 02	29 – 10	22.20	TRUE
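As a rough check, the Correct column and the success rate can be reproduced from the numbers above. This sketch assumes a prediction counts as correct when its sign matches the sign of the actual home margin, and that the "-0.00" for Broncos vs. Sea Eagles is a very small negative value (a narrow predicted away win) that has been rounded. The function name `is_correct` is made up for illustration, and draws are not handled.

```python
# (home team, away team, home score, away score, predicted home margin)
last_week = [
    ("Panthers", "Eels", 38, 12, 10.30),
    ("Roosters", "Raiders", 26, 12, 19.50),
    ("Cowboys", "Storm", 22, 0, 5.50),
    ("Warriors", "Knights", 38, 18, 6.60),
    ("Broncos", "Sea Eagles", 36, 10, -0.00),  # assumed: tiny negative, rounded
    ("Rabbitohs", "Dragons", 29, 10, 22.20),
]


def is_correct(home_score, away_score, predicted_margin):
    """True when the predicted winner (sign of the margin) matches the result."""
    actual_margin = home_score - away_score
    return (actual_margin > 0) == (predicted_margin > 0)


results = [is_correct(hs, as_, pred) for _, _, hs, as_, pred in last_week]

for (home, away, *_), ok in zip(last_week, results):
    print(f"{home} vs. {away}: {ok}")

# Season totals quoted above: 51 correct out of 91 matches.
season_rate = 51 / 91
print(f"Season success rate: {season_rate:.0%}")
```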

Predictions for Round 13

Here are the predictions for Round 13. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

	Game	Date	Winner	Prediction
1	Sea Eagles vs. Bulldogs	Jun 06	Sea Eagles	2.90
2	Eels vs. Cowboys	Jun 06	Cowboys	-8.30
3	Titans vs. Panthers	Jun 07	Titans	1.00
4	Dragons vs. Sharks	Jun 07	Dragons	0.60
5	Rabbitohs vs. Warriors	Jun 07	Rabbitohs	8.50
6	Knights vs. Wests Tigers	Jun 08	Knights	7.70
7	Storm vs. Roosters	Jun 08	Roosters	-2.30
8	Raiders vs. Broncos	Jun 09	Broncos	-4.70

How much disagreement should there be?

By Thomas Lumley

The Herald reports:

Thousands of school students are being awarded the wrong NCEA grades, a review of last year’s results has revealed.

Nearly one in four grades given by teachers for internally marked work were deemed incorrect after checking by New Zealand Qualifications Authority moderators.

That’s not actually true, because moderators don’t deem grades to be incorrect. That’s not what moderators are for. What the report says (pp105-107 in case you want to scroll through it) is that in 24% of cases the moderator and the internal assessor disagreed on grade, and in 12% they disagreed on whether the standard had been achieved.

What we don’t know is how much disagreement is appropriate. The only way the moderator’s assessment could be considered error-free is if you define the ‘right answer’ to be ‘whatever the moderator says’, which is obviously not appropriate. There always will be some variation between moderators, and some variation between schools, and what we want to know is whether there is too much.

The report is a bit disappointing from that point of view. At the very least, there should have been some duplicate moderation. That is, some pieces of work should have been sent to two different moderators, so we could have an idea of the between-moderator agreement rate. Then, if we were willing to assume that moderators collectively were infallible (though not individually), we could estimate how much less reliable the internal assessments were.
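To make the duplicate-moderation idea concrete, here is a minimal sketch under strong assumptions that are not in the report: the judgement is binary (standard achieved or not), every grader errs independently of every other, and all moderators share one error rate. The 12% moderator-assessor disagreement is from the report; the 8% moderator-moderator disagreement is a purely hypothetical number, since no duplicate moderation was done. The function names are invented for illustration.

```python
import math

# From the report: moderator and assessor disagreed on "achieved" in 12% of cases.
OBSERVED_MODERATOR_ASSESSOR_DISAGREEMENT = 0.12
# HYPOTHETICAL: what duplicate moderation might have shown. Not in the report.
HYPOTHETICAL_MODERATOR_MODERATOR_DISAGREEMENT = 0.08


def moderator_error_rate(disagreement_between_moderators):
    """Two independent graders with the same error rate p disagree
    with probability 2p(1-p); solve for the smaller root p."""
    return (1 - math.sqrt(1 - 2 * disagreement_between_moderators)) / 2


def assessor_error_rate(moderator_error, disagreement_with_assessor):
    """Graders with error rates p_m and p_a disagree with probability
    p_m(1-p_a) + p_a(1-p_m); solve for p_a."""
    return (disagreement_with_assessor - moderator_error) / (1 - 2 * moderator_error)


p_m = moderator_error_rate(HYPOTHETICAL_MODERATOR_MODERATOR_DISAGREEMENT)
p_a = assessor_error_rate(p_m, OBSERVED_MODERATOR_ASSESSOR_DISAGREEMENT)

print(f"implied moderator error rate: {p_m:.3f}")
print(f"implied assessor error rate:  {p_a:.3f}")
```

If the moderators disagreed with each other nearly as often as they disagree with teachers, the implied assessor error rate would be close to the moderators' own, and the 12% figure would say little about teachers specifically.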

Even better would be to get some information on how much variation there is between schools in the disagreement: if there is very little variation, the schools may be doing about as well as is possible, but if there is a lot of variation between schools it would suggest some schools aren’t assessing very reliably.


June 3, 2014

Are girl hurricanes less scary?

By Thomas Lumley

There’s a new paper out in the journal PNAS claiming that hurricanes with female names cause three times as many deaths as those with male names (because people don’t give girl hurricanes the proper respect). Ed Yong does a good job of explaining why this is probably bogus, but no-one seems to have drawn any graphs, which I think make the situation a lot clearer.

June 2, 2014

Stat of the Week Competition: May 31 – June 6 2014

By Rachel Cunliffe

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday June 6 2014.
Statistics can be bad, exemplary or fascinating.
The statistic must be in the NZ media during the period of May 31 – June 6 2014 inclusive.
Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


May 30, 2014

Trusting your data or your model

By Thomas Lumley

Even with large amounts of data, automated predictions must usually incorporate explicit or implicit prior understanding of the structure of the problem. “Look for anything” is not good enough: “anything” is too big.

Here, for your weekend light entertainment, are some examples where the prior structure was too strong or too weak:

The example that prompted this post, from the blog of Melville House Press, is about automated scanning of books to create digital editions:

in many old texts the scanner is reading the word ‘arms’ as ‘anus’ and replacing it as such in the digital edition. As you can imagine, you don’t want to be getting those two things mixed up.

A similar phenomenon was pointed out at Language Log a decade ago:

Fear not your toes, though they are strong,
The conquest doth to you belong;

Daniel Dennett recounts two anecdotes of speech recognition, one human and one computer, which err in the opposite direction to the text recognition example. The computer one:

An AI speech-understanding system whose development was funded by DARPA (Defense Advanced Research Projects Agency), was being given its debut before the Pentagon brass at Carnegie Mellon University some years ago. To show off the capabilities of the system, it had been attached as the “front end” or “user interface” on a chess-playing program. The general was to play white, and it was explained to him that he should simply tell the computer what move he wanted to make. The general stepped up to the mike and cleared his throat–which the computer immediately interpreted as “Pawn to King-4.”

And, the example that is frustratingly familiar to so many of us: mobile phone autocorrupt, which you can search for yourself.

Levels of evidence

By Thomas Lumley

If you find that changing your diet in some way makes you feel happier and healthier, that’s a good thing. It doesn’t matter whether the same change would be useful for most people, or only useful for you. It doesn’t matter whether the change is a placebo effect. It doesn’t even matter if it’s an illusion, a combination of regression to the mean and confirmation bias. You might check with a doctor or dietician as to whether the change is dangerous, but otherwise, go for it.

If you want to campaign for the entire community to make a change in their diet, you need to have evidence that it’s better on average for the entire community. A few people’s subjective experience isn’t good enough. Good quality observational data might be all you can manage if the benefits are subtle or take years to appear, but if you’re claiming dramatic short-term benefits you should be able to demonstrate them in a randomised controlled trial.

The reason for mentioning this is that PETA has been making friends again. They’re trying to link milk consumption to autism. They don’t even pretend to have any evidence that milk causes autism, and the evidence that a milk-free diet has a beneficial effect in people with autism is very weak. That is, there are a few studies that suggest a benefit, but the benefit is smaller in studies with more reliable designs, and absent in the best-designed studies. The most recent review of the evidence concluded that dairy-free or gluten-free diets should only be tried for people who have some separate evidence of food intolerance. After reading the review, I would agree.

There are respectable arguments against dairy farming, both ethical and environmental. Scaremongering about autism isn’t one of them.

May 29, 2014

Lede program at Columbia

By Thomas Lumley

Columbia University in New York is running an amazing-looking data journalism certificate called The Lede Program. The program director is Cathy O’Neil of mathbabe.org and Occupy Finance, and the program advisor is Mark Hansen, statistician, computational scientist, and artist.

Anyway, their syllabus (and quite a bit of other content) is available on Github.

I’d like to quote a course outline by Cathy O’Neil:

This course begins with the idea that computing tools are the products of human ingenuity and effort. They are never neutral and carry with them the biases of their designers and their design process. “Platform studies” is a new term used to describe investigations into these relationships between computing technologies and the creative or research products that they help to generate. How you understand how data, code, and algorithms affect creative practices can be an effective first step toward critical thinking about technology.
