Posts filed under Politics (127)

October 7, 2014

Enumerating hard-to-reach populations

I’ve written before about how it’s hard to get accurate estimates of the size of small subpopulations, even with large, well-designed surveys.

Via the Herald

Mr Key said that was an emerging issue for New Zealand. “If I was to spell out to New Zealanders the exact number of people looking to leave and be foreign fighters, it would be larger, I think, than New Zealanders would expect that number to be.”

If the government really knows the ‘exact number’, there must have been a lot more domestic surveillance than we’ve been told about.

New Zealanders probably don’t have well-formed expectations for that number, since we have basically no information to go on. My guess would be along the lines of “Not very many, but people are strange, so probably some.” I’d be surprised if it were less than 10 or more than 1000.


October 6, 2014

NZ voting cartograms

One of the problems with electoral maps is the ‘one cow, one vote’ effect: rural electorates are physically bigger, and so take up more of the map. When you combine that with the winner-take-all impact of simple colour schemes, it can look as though National won basically everything instead of just missing out on a majority.

Using a design by Chris McDowall that I linked earlier this year, David Friggens has mapped out the party votes across the country with equal area given to each electorate.  These maps show where the votes for each major party came from:



He also has maps for the minor parties, some of which have very localised support.

September 22, 2014

So, we had an election

Turnout of enrolled voters was up 3 percentage points over 2011, but enrolment was down, so as a fraction of the eligible population, turnout was only up half a percentage point.

From the Herald’s interactive, here are the remarkably boring trends through the count:

There are a few electorates that are, arguably, still uncertain, but by 9pm the main real uncertainty at the nationwide level was whether Hone Harawira would win Te Tai Tokerau, and that wasn’t going to affect who was in government.  By 10pm it was pretty clear Harawira was out (though he hadn’t conceded) and that Internet Mana had been, in his opponent’s memorable phrase, “all steam and no hangi.”

Jonathan Marshall (@jmarshallnz) has posted swings in each electorate, for the party vote and electorate vote. He also has an interactive Sainte-Laguë seat allocation calculator and has published the data (complete apart from special votes) in a convenient form for y’all to play with.
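If you want to reproduce what a seat-allocation calculator does, the Sainte-Laguë method is only a few lines of code: each seat in turn goes to the party with the highest quotient votes / (2 × seats already won + 1). A minimal sketch in Python, with made-up vote counts rather than real results:

```python
def sainte_lague(votes, seats):
    """Allocate seats one at a time: each seat goes to the party with
    the highest quotient votes / (2 * seats_already_won + 1)."""
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: votes[p] / (2 * allocation[p] + 1))
        allocation[winner] += 1
    return allocation

# Made-up vote counts, allocating 10 seats
print(sainte_lague({"A": 53000, "B": 31000, "C": 16000}, 10))
```

The real calculation allocates 120 seats (more with overhangs) among the parties that clear the threshold, but the odd-number divisor sequence 1, 3, 5, … is the whole trick.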

David Heffernan (@kiwipollguy) collected a bunch of poll, poll average, and pundit predictions, and writes about them here. The basic summary is that they weren’t very good, though there weren’t any totally loony ones, as there were for the last US Presidential election. Our pundits seem to be moderately well calibrated to reality, but there’s a lot of uncertainty in the system and the improvement from averaging seems pretty small.  The only systematic bias is that the Greens did a bit worse than expected.

Based on his criterion, which is squared prediction error scaled basically by party vote, two single polls — 3 News/Reid at the high end and Herald Digipoll at the low end — spanned almost the entire range of prediction error.

The variation between predictions isn’t actually much bigger than you’d expect by chance. The prediction errors have the mean you’d expect from a random sample of about 400 people, and apart from two outliers they have the right spread as well. On the graph, the red curve is a chi-squared distribution with 9 degrees of freedom, and the black curve is the distribution of the 23 estimates. The outliers are Wikipedia and the last 3 News/Reid Research poll.
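You can check the “random sample of about 400 people” claim for yourself: the standard error of an estimated vote share from a simple random sample is √(p(1−p)/n). A quick sketch (the 47% share is just an illustrative number, not taken from any of the polls):

```python
import math

def sampling_se(p, n):
    """Standard error of an estimated vote share p from a simple
    random sample of n respondents."""
    return math.sqrt(p * (1 - p) / n)

# A party polling around 47% with an effective sample size of 400:
# sampling variation alone gives about a 2.5-point standard error.
print(round(100 * sampling_se(0.47, 400), 1))
```

With nominal sample sizes closer to 1000, an effective size of 400 is a reminder that weighting and house effects inflate the error well beyond the quoted “maximum margin of error”.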


About half the predictions were qualitatively wrong: they had National needing New Zealand First or the Conservatives for a majority. The Conservatives were clearly treated unfairly by the MMP threshold. If some party had to be, I’m glad it’s them, but a party with more votes than the Māori Party, Internet Mana, ACT, United Future, and Legalise Cannabis put together should have a chance to prove their unsuitability in Parliament.


September 18, 2014

Interactive election results map

The Herald has an interactive election-results map, which will show results for each polling place as they come in, together with demographic information about each electorate.  At the moment it’s showing the 2011 election data, and the displays are still being refined — but the Herald has started promoting it, so I figure it’s safe for me to link as well.

Mashblock is also developing an election site. At the moment they have enrolment data by age. Half the people under 35 in Auckland Central seem to be unenrolled, which is a bit scary. Presumably some of them are students enrolled at home, and some haven’t been in NZ long enough to enrol, but still.

Some non-citizens probably don’t know that they are eligible — I almost missed out last time. So, if you know someone who is a permanent resident and has lived in New Zealand for a year, you might just ask if they know about the eligibility rules. Tomorrow is the last day.

August 30, 2014

Funding vs disease burden: two graphics

You have probably seen the graphic from vox.com:


There are several things wrong with it. From a graphics point of view it doesn’t make any of the relevant comparisons easy. The diameter of each circle is proportional to the deaths or money, so the areas grow as the square of the values and exaggerate the differences. And the donation data are basically wrong — the original story tries to make it clear that these are particular events, not all donations for a disease, but it’s the graph that is quoted.

For example, the graph lists $54 million for heart disease, based on the ‘Jump Rope for Heart’ fundraiser. According to Forbes magazine’s list of top charities, the American Heart Association actually received $511 million in private donations in the year to June 2012, almost ten times as much.  Almost as much again came in grants for heart disease research from the National Institutes of Health.

There’s another graph I’ve seen on Twitter, which shows what could have been done to make the comparisons clearer:



It’s limited, because it only shows government funding, not private charity, but it shows the relationship between funding and the aggregate loss of health and life for a wide range of diseases.

There are a few outliers, and some of them are for interesting reasons. Tuberculosis is not currently a major health problem in the US, but it is in other countries, and there’s a real risk that it could spread to the US.  AIDS is highly funded partly because of successful lobbying, partly because it — like TB — is a foreign-aid issue, and partly because it has been scientifically rewarding and interesting. COPD and lung cancer are going to become much less common in the future, as the victims of the century-long smoking epidemic die off.

Depression and injuries, though?


Update: here’s how distorted the areas are: the purple number is about 4.2 times the blue number.
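To see how badly diameter-proportional circles mislead: the areas, which are what your eye actually compares, grow as the square of the diameters. A quick check using the 4.2× figure:

```python
import math

def circle_area(diameter):
    return math.pi * (diameter / 2) ** 2

# A value 4.2 times larger, drawn with 4.2 times the diameter,
# gets about 17.6 times the area
print(round(circle_area(4.2) / circle_area(1.0), 1))
```

Scaling the *area* to the value (diameter proportional to the square root) is the standard fix.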


August 29, 2014

Getting good information to government

On the positive side: there’s a conference of science advisers and people who know about the field here in Auckland at the moment. There’s a blog, and there will soon be videos of the presentations.

On the negative side: Statistics Canada continues to provide an example of how a world-class official statistics agency can go downhill with budget cuts and government neglect.  The latest story is the report on how the Labour Force Survey (which is how unemployment is estimated) was off by 42,000 in July. There’s a shorter writeup in Maclean’s magazine, and their archive of stories on StatsCan is depressing reading.

August 19, 2014

“More maps that won’t change your mind about racism in America”



Ultimately, despite the centrality of social media to the protests and our ability to come together and reflect on the social problems at the root of Michael Brown’s shooting, these maps, and the kind of data used to create them, can’t tell us much about the deep-seated issues that have led to the killing of yet another unarmed young black man in our country. And they almost certainly won’t change anyone’s mind about racism in America. They can, instead, help us to better understand how these events have been reflected on social media, and how even purportedly global news stories are always connected to particular places in specific ways.

August 8, 2014

History of NZ Parliament visualisation

One frame of a video showing NZ party representation in Parliament over time,


made by Stella Blake-Kelly for TheWireless. Watch (and read) the whole thing.

August 7, 2014

Non-bogus non-random polling

As you know, one of the public services StatsChat provides is whingeing about bogus polls in the media, at least when they are used to anchor stories rather than just being decorative widgets on the webpage. This attitude doesn’t (or doesn’t necessarily) apply to polls that make no effort to collect a random sample but do make serious efforts to reduce bias by modelling the data. Personally, I think it would be better to apply these modelling techniques on top of standard sampling approaches, but that might not be feasible. You can’t do everything.

I’ve been prompted to write this by seeing Andrew Gelman and David Rothschild’s reasonable and measured response (and also Andrew’s later reasonable and less measured response) to a statement from the American Association for Public Opinion Research.  The AAPOR said

This week, the New York Times and CBS News published a story using, in part, information from a non-probability, opt-in survey sparking concern among many in the polling community. In general, these methods have little grounding in theory and the results can vary widely based on the particular method used. While little information about the methodology accompanied the story, a high level overview of the methodology was posted subsequently on the polling vendor’s website. Unfortunately, due perhaps in part to the novelty of the approach used, many of the details required to honestly assess the methodology remain undisclosed.

As the responses make clear, the accusation about transparency of methods is unfounded. The accusation about theoretical grounding is the pot calling the kettle black.  Standard survey sampling theory is one of my areas of research. I’m currently writing the second edition of a textbook on it. I know about its grounding in theory.

The classical theory applies to most of my applied sampling work, which tends to involve sampling specimen tubes from freezers. The theoretical grounding does not apply when there is massive non-response, as in all political polling. It is an empirical observation based on election results that carefully-done quota samples and reweighted probability samples of telephones give pretty good estimates of public opinion. There is no mathematical guarantee.

Since classical approaches to opinion polling work despite massive non-response, it’s reasonable to expect that modelling-based approaches to non-probability data will also work, and reasonable to hope that they might even work better (given sufficient data and careful modelling). Whether they do work better is an empirical question, but these model-based approaches aren’t a flashy new fad. Rod Little, who pioneered the methods AAPOR is objecting to, did so nearly twenty years before his stint as Chief Scientist at the US Census Bureau, an institution not known for its obsession with the latest fashions.
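For concreteness, here’s the simplest version of the idea, post-stratification: reweight a non-random sample so each demographic cell counts according to its known population share. All the numbers below are made up for illustration:

```python
# Made-up survey: a non-random sample that over-represents under-40s
sample = {"18-39": {"n": 700, "support": 0.30},
          "40+":   {"n": 300, "support": 0.50}}

# Known population shares for the same cells (e.g. from the census)
population_share = {"18-39": 0.40, "40+": 0.60}

# Naive estimate: just average the whole sample
raw = (sum(c["n"] * c["support"] for c in sample.values())
       / sum(c["n"] for c in sample.values()))

# Post-stratified estimate: weight each cell by its population share
adjusted = sum(population_share[g] * sample[g]["support"] for g in sample)

print(round(raw, 2), round(adjusted, 2))
```

The modelling the Gelman/Rothschild work uses is much more elaborate (regression within cells, many more cells), but this is the underlying logic, and it’s the same logic used to reweight probability samples for non-response.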

In some settings modelling may not be feasible because of a lack of population data. In a few settings non-response is not a problem. Neither of those applies in US political polling. It’s disturbing when the president of one of the largest opinion-polling organisations argues that model-based approaches should not be referenced in the media, and that’s even before considering some of the disparaging language being used.

“Don’t try this at home” might have been a reasonable warning to pollsters without access to someone like Andrew Gelman. “Don’t try this in the New York Times” wasn’t.

August 4, 2014

Predicting blood alcohol concentration is tricky

Rasmus Bååth, who is doing a PhD in Cognitive Science in Sweden, has written a web app that predicts blood alcohol concentrations using reasonably sophisticated equations from the forensic science literature.

The web page gives a picture of the whole BAC curve over time, but requires a lot of detailed inputs. Some of these are things you could know accurately: your height and weight, exactly when you had each drink and what it was. Some of them you have a reasonable idea about: whether your stomach is empty or full, and therefore whether alcohol absorption is fast or slow. You also need to specify an alcohol elimination rate, which he says averages 0.018%/hour but could be half or twice that, and you have no real clue.
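For a sense of the arithmetic (the app uses more sophisticated forensic-science equations than this), here’s a toy Widmark-style calculation. The distribution factor r and the 10 g standard drink are textbook round numbers, not the app’s actual parameters:

```python
def bac_percent(alcohol_grams, weight_kg, hours, r=0.68,
                elimination_rate=0.018):
    """Widmark-style BAC estimate (in %, i.e. grams per 100 mL):
    peak = alcohol / (r * body weight), minus linear elimination.
    r is roughly 0.68 for men and 0.55 for women."""
    peak = 100 * alcohol_grams / (weight_kg * 1000 * r)
    return max(0.0, peak - elimination_rate * hours)

# Three 10 g standard drinks, 80 kg man, two hours after drinking
print(round(bac_percent(30, 80, 2), 3))
```

Even in this stripped-down version, moving the elimination rate between half and double the 0.018 average takes the two-hour estimate from about 0.037 all the way down to zero, which is exactly why the official advice about the new limits is so approximate.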

If you play around with the interactive controls, you can see why the advice given along with the new legal limits is so approximate (as Campbell Live is demonstrating tonight).  Rasmus has all sorts of disclaimers about how you shouldn’t rely on the app, so he’d probably be happier if you don’t do any more than that with it.