Posts filed under Politics (185)

September 20, 2017

Democracy is coming

Unless someone says something really annoyingly wrong about polling in the next few days, I’m going to stop commenting until Saturday night.

Some final thoughts:

  • The election looks closer than NZ opinion polling is able to discriminate. Anyone who thinks they know what the result will be is wrong.
  • The most reliable prediction based on polling data is that the next government will at least need confidence and supply from NZ First. Even that isn’t certain.
  • It’s only because of opinion polling that we know the election is close. It would be really surprising if Labour didn’t do a lot better than the 25% they managed in the 2014 election — but we wouldn’t know that without the opinion polls.
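The “closer than polling can discriminate” point can be made concrete with the idealised margin of error. Here’s a minimal sketch, assuming polls of about 1000 respondents — a typical NZ sample size, but my assumption, not a figure from the post:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Idealised 95% margin of error for a proportion p estimated from a
    simple random sample of size n. Real polls have extra design effects,
    so this is a lower bound on the real uncertainty."""
    return z * math.sqrt(p * (1 - p) / n)

# For a party polling around 40% with n = 1000 (assumed):
moe = margin_of_error(0.40, 1000)
print(round(100 * moe, 1))  # 3.0 -- about +/- 3 percentage points
```

With gaps between the blocs smaller than that, individual polls genuinely can’t tell you who will be able to form a government.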



September 10, 2017

Why you can’t predict Epsom from polls

The Herald’s poll aggregator had a bit of a breakdown over the Epsom electorate yesterday, suggesting that Labour had a chance of winning.

Polling data (and this isn’t something a statistician likes saying) is essentially useless when it comes to Epsom, because neither side benefits from getting their own supporters’ votes. National supporters are a clear majority in the electorate. If they do their tactical voting thing properly and vote for ACT’s David Seymour, he will win.  If they do the tactical voting thing badly enough, and the Labour and Green voters do it much better, National’s Paul Goldsmith will win.

Opinion polls over the whole country don’t tell you about tactical voting strategies in Epsom. Even opinion polls in Epsom would have to be carefully worded, and you’d have to be less confident in the results.

There isn’t anywhere else quite like Epsom. There are other electorates that matter and are hard to predict — such as Te Tai Tokerau, where polling information on Hone Harawira’s popularity is sparse — but in those electorates the polls are at least asking the right question.

Peter Ellis’s poll aggregator just punts on this question: the probability of ACT winning Epsom is set at an arbitrary 80%, and he gives you an app that lets you play with the settings. I think that’s the right approach.

September 4, 2017

Before and after

We’re in the interesting situation this election where it looks like political preferences are actually changing quite rapidly (though some of this could be changes in non-response that don’t show up in actual voting).

On Thursday, One News released a poll by Colmar Brunton that found Labour ahead of National by 43% to 41% for the first time in years.  Yesterday, NewsHub released a Reid Research poll with Labour back behind National 39% to 43%.

“Released” is important here. The Colmar Brunton poll was taken over August 26-30. The Reid Research poll was taken over August 22-30. That is, despite being released later, the Reid Research poll was (on average) taken earlier. Commentary on (and even analysis of) polls often ignores the interview period and focuses on the release date, but here we can see why the pollsters’ code of conduct requires the interview period to be described.

A difference of 4 percentage points in Labour’s support is quite large for two polls of this size (though not out of the question just from sampling error). If the polls were really discrete events four days apart, it would be plausible to argue they showed Labour’s support had stopped increasing — that the Ardern effect had reached its limit. If the two polls were taken over exactly the same period, the most plausible conclusion would be that the true support was in between and that we knew nothing more about Labour’s trajectory. With the Sunday poll actually taken slightly earlier, the difference is still likely to mostly be noise, but to the (very limited) extent that it says anything about trajectory, the story is positive for Labour.
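How plausible is a 4-point gap from sampling error alone? A rough check, assuming two independent polls of about 1000 people each (the sample sizes are my assumption, not from the post):

```python
import math

def se_difference(p, n1, n2):
    """Standard error of the difference between two independent poll
    estimates of the same underlying proportion p."""
    return math.sqrt(p * (1 - p) / n1 + p * (1 - p) / n2)

# Labour's support averaged roughly 41% across the two polls:
se = se_difference(0.41, 1000, 1000)
print(round(1.96 * 100 * se, 1))  # 4.3 -- 95% limit for the gap, in points
```

A 4-point difference sits right at the edge of what pure sampling noise produces, which is why it’s “quite large... though not out of the question”.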

August 26, 2017

Successive approximations to understanding MMP

The MMP voting system and its implications are relatively complicated. I’m going to try to give simple approximations and then corrections to them. If you want more definitive details, here’s the Electoral Commission and the Electoral Act.

Two votes: You have an electorate vote, which only affects who your local MP is, and doesn’t affect the composition of Parliament. You also have a party vote that affects the composition of Parliament, but not who your local MP is. The number of seats a party gets in Parliament is proportional to the number of party votes it gets.

This isn’t true, but it’s actually a pretty good working approximation for most of us.

There are two obvious flaws. First, if your local MP belongs to a party that doesn’t get enough votes to have any seats in Parliament, they still get to be an MP. Peter Dunne in Ōhariu was an example of this in the 2014 election. Second, when working out the number of seats a party is entitled to in Parliament, parties with less than 5% of the vote are excluded unless they won some electorate.  In the 2014 election, the Conservative Party got 3.97% of the vote, but no seats.

The Māori Party was an example of both exceptions: they did get enough votes in proportional terms for two seats, though not enough to make the 5% cutoff; they didn’t have to, because Te Ururoa Flavell won the Waiāriki electorate seat for them.

Proportionality: There are 120 seats, so a party needs 1/120th, or about 0.83%, of the vote for each one.

That’s not quite true because of the 5% threshold, both because some parties miss out and because the relevant percentages are of the votes remaining after parties have been excluded by the threshold.

It’s also not true because of rounding.  We elect whole MPs, not fractional ones, so we need a rounding rule. Roughly speaking, half-seats round up. More accurately, suppose there is some number N of votes available per seat (which will be worked out later). If you have at least 0.5×N votes you get one seat, 1.5×N gets you two seats, 13.5×N gets you fourteen seats.  So what’s N? It’s roughly 1/120th (0.83%) of the votes; it’s exactly whatever number you need to allocate exactly as many seats as you have available. (The Electoral Commission actually uses a procedure that’s identical in effect to this one and easier to compute, but (I think) harder to explain.)
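The “find N and round half-seats up” description is equivalent to the Sainte-Laguë highest-averages method, which is the easier-to-compute procedure. A sketch with made-up vote counts (real tie-breaking rules are not implemented here):

```python
def allocate_seats(votes, total_seats=120):
    """Sainte-Lague highest-averages allocation. `votes` maps party to
    party votes, after the threshold has been applied. Awarding each seat
    to the largest quotient votes/(2*seats+1) gives exactly round(votes/N)
    seats for the N described in the post."""
    seats = {party: 0 for party in votes}
    for _ in range(total_seats):
        # the next seat goes to the party with the largest quotient
        best = max(votes, key=lambda p: votes[p] / (2 * seats[p] + 1))
        seats[best] += 1
    return seats

# Illustrative vote counts (not 2014 results): one million votes,
# so N is about 8333 votes per seat.
print(allocate_seats({"A": 470000, "B": 400000, "C": 90000, "D": 40000}))
# {'A': 56, 'B': 48, 'C': 11, 'D': 5}
```

Note that party D, with 4.8 × N votes, rounds up to five seats, while A’s 56.4 × N rounds down; the quotients sort all that out automatically.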

In 2014, the Māori Party got 1.32% of the vote, which is a bit more than 1.5×0.83%, and were entitled to two seats. ACT got less than 0.83% but more than 0.5×0.83% and were entitled to one seat.

Finally, if a party gets more seats from electorate candidates than it is due by proportionality those seats are extra, above the 120-seat ideal size of Parliament — except that seats won by a party or individual not contesting the party vote do come out of the 120-seat total.  So, in 2014, ACT got enough party votes to be due one of the 120 seats, but United Future didn’t. United Future did contest the party vote so Peter Dunne’s seat did not come out of the 120-seat total — he was an ‘overhang’ 121st MP. I’m guessing the reason overhangs by parties contesting the party vote are extra is that you don’t know how many there will be until you’ve done the calculation, so you’d have to go back to the start and recalculate if you counted them in the 120 (which might change the number of over-allocated seats and force another recalculation and so on).

Māori Roll: People of Māori descent can choose, every five years, to be on a Māori electoral roll rather than the general roll. If enough of them do, Māori electorates are created with the same number of people as the general electorates. There are currently seven Māori electorates, representing just over half of the people of Māori descent.  As with any electorate, you don’t have to be enrolled there to stand there; anyone eligible to be an MP can stand. 

The main oversimplification involves the people of Māori descent who aren’t on either roll, because they’re too young or just not enrolled yet. You can’t tell whether they would be on the general roll or the Māori roll, so there are procedures for StatsNZ to split the non-enrolled Māori-descent population up when calculating electorate populations.

August 22, 2017

Deciding how to vote

There’s a bunch of web pages/apps out there that supposedly help you to decide who to vote for.

On the Fence: This one asks you to move a slider to ‘balance’ competing principles, then works out which party you agree with.

There are some obvious problems. First, the scale isn’t clearly calibrated.  If you’re at 50:50 on government vs private-sector roles in providing affordable housing, does that mean you think 50% of it should be state houses, or that it should all be state-owned but built by private sector construction companies, or something vague and woolly?

Second, as lots of people have pointed out, there are some false dichotomies there, like the privacy:security tradeoff.

Perhaps more important, when there is a genuine tradeoff, it’s a genuine tradeoff. You typically can’t decide it by abstract principle without reference to the facts.

Vote Compass:  This one takes advantage of the empirical observation that people’s voting preferences compress fairly well into two dimensions.  The questions are much more clearly calibrated: eg, the affordable-housing one is “The government should build affordable housing for Kiwis to buy” with a “Strongly agree” to “Strongly disagree” scale.

Most usefully, there’s a tool for you to explore how your position differs from that of the parties on each of the questions, and to reweight the results depending on which issues you care about.  Annoyingly, there’s a category “Moral Issues” that includes marijuana legalisation but not the questions about refugees or climate change or affordable housing.

Policy: The Spinoff has a tool that seems philosophically different from the others. It has much more emphasis on comparing actual party policies and less on trying to find out what your ideal party would be. As a result, it’s less useful if you want to be told what you think, but might be more useful if you want to look at specific policies. Whether you do, I suppose, depends on how much you believe the policies — especially from the minor parties, where you’d need to know how the policies rank in their actual negotiating position for coalition or confidence & supply.

July 30, 2017

What are election polls trying to estimate? And is Stuff different?

Stuff has a new election ‘poll of polls’.

The Stuff poll of polls is an average of the most recent of each of the public political polls in New Zealand. Currently, there are only three: Roy Morgan, Colmar Brunton and Reid Research. 

When these companies release a new poll it replaces their previous one in the average.

The Stuff poll of polls differs from others by giving weight to each poll based on how recent it is.

All polls less than 36 days old get equal weight. Any poll 36-70 days old carries a weight of 0.67, 70-105 days old a weight 0.33 and polls greater than 105 days old carry no weight in the average.
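The weighting scheme described above can be sketched in a few lines. The boundary behaviour at exactly 36, 70, and 105 days is my guess; the description doesn’t pin it down:

```python
def stuff_weight(age_in_days):
    """Recency weight as described for the Stuff poll of polls: full
    weight under 36 days, 0.67 to 70 days, 0.33 to 105 days, zero after
    that. Boundary handling is assumed, not specified by Stuff."""
    if age_in_days < 36:
        return 1.0
    if age_in_days < 70:
        return 0.67
    if age_in_days < 105:
        return 0.33
    return 0.0

def weighted_average(polls):
    """`polls` is a list of (support, age_in_days) pairs -- one entry per
    company, each company's most recent poll only."""
    weights = [stuff_weight(age) for _, age in polls]
    return sum(s * w for (s, _), w in zip(polls, weights)) / sum(weights)

# Hypothetical Labour numbers from three companies, of varying age:
print(round(weighted_average([(43.0, 10), (39.0, 40), (33.0, 90)]), 2))
# 40.01 -- the stale 33% poll barely moves the average
```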

In thinking about whether this is a good idea, we’d need to first think about what the poll is trying to estimate and about the reasons it doesn’t get that target quantity exactly right.

Officially, polls are trying to estimate what would happen “if an election were held tomorrow”, and there’s no interest in prediction for dates further forward in time than that. If that were strictly true, no-one would care about polls, since the results would refer only to the past two weeks when the surveys were done.

A poll taken over a two-week period is potentially relevant because there’s an underlying truth that, most of the time, changes more slowly than this.  It will occasionally change faster — eg, Donald Trump’s support in the US polls seems to have increased after James Comey’s claims about Clinton’s emails, and Labour’s support in the UK polls increased after the election was called — but it will mostly change slower. In my view, that’s the thing people are trying to estimate, and they’re trying to estimate it because it has some medium-term predictive value.

In addition to changes in the underlying truth, there is the idealised sampling variability that pollsters quote as the ‘margin of error’. There’s also larger sampling variability that comes because polling isn’t mathematically perfect. And there are ‘house effects’, where polls from different companies have consistent differences in the medium to long term, and none of them perfectly match voting intentions as expressed at actual elections.

Most of the time, in New Zealand — when we’re not about to have an election — the only recent poll is a Roy Morgan poll, because Roy Morgan polls much more often than anyone else.  That means the Stuff poll of polls will be dominated by the most recent Roy Morgan poll.  This would be a good idea if you thought that changes in underlying voting intention were large compared to sampling variability and house effects. If you thought sampling variability was larger, you’d want multiple polls from a single company (perhaps downweighted by time).  If you thought house effects were non-negligible, you wouldn’t want to downweight other companies’ older polls as aggressively.

Near an election, there are lots more polls, so the most recent poll from each company is likely to be recent enough to get reasonably high weight. The Stuff poll is then distinctive in that it completely drops all but the most recent poll from each company.

Recency weighting, however, isn’t at all unique to the Stuff poll of polls. For example, the poll of polls downweights older polls, but doesn’t drop the weight to zero once another poll comes out. Peter Ellis’s two summaries both downweight older polls in a more complicated and less arbitrary way; the same was true of Peter Green’s poll aggregation when he was doing it.  Curia’s average downweights even more aggressively than Stuff’s, but does not otherwise discard older polls by the same company. RadioNZ averages only the four most recent available results (regardless of company) — they don’t do any other weighting for recency, but that’s plenty.

However, another thing recent elections have shown us is that uncertainty estimates are important: that’s what Nate Silver and almost no-one else got right in the US. The big limitation of simple, transparent poll of poll aggregators is that they say nothing useful about uncertainty.

May 14, 2017

There’s nothing like a good joke

You’ve probably seen the 2016 US election results plotted by county, as in this via Brilliant Maps.

It’s not ideal, because large, relatively empty counties take up a lot of space but represent relatively few people.  It’s still informative: you can see, for example, that urban voters tended to support Clinton even in Texas.  There are also interesting blue patches in rural areas that you might need an atlas to understand.

For most purposes, it’s better to try to show the votes, such as this from the New York Times, where the circle area is proportional to the lead in votes.

You might want something that shows the Electoral College votes, since those are what actually determines the results, like this by Tom Pearson for the Financial Times.

Or, you might like pie charts, such as this one from Lisa Charlotte Rost.


These all try to improve on the simple county map by showing votes — people — rather than land. The NYT one is more complex than the straightforward map; the other two are simpler but still informative.


Or, you could simplify the county map in another way. You could remove all the spatial information from within states — collecting the ‘blue’ land into one wedge and the ‘red’ land into another — and not add anything. You might do this as a joke, to comment on the President’s use of the simple county map.

The problem with the Internet, though, is that people might take it seriously.  It’s not completely clear whether Chris Cillizza was just trolling, but a lot of people sure seem to take his reposting of it seriously.

May 4, 2017

Summarising a trend

Keith Ng drew my attention on Twitter to an ad from Labour saying “Under National, the number of young people not earning or learning has increased by 41%”.

When you see this sort of claim, you should usually expect two things: first, that the claim will be true in the sense that there will be two numbers that differ by 41%; second, that it will not be the most informative summary of the data in question.

If you look on Infoshare, in the Household Labour Force Survey, you can find data on NEET (not in education, employment, or training).  The number was 64,100 in the fourth quarter of 2008, when Labour lost the election.  It’s now (Q1, 2017) 90,800, which is, indeed, 41% higher.  Let’s represent the ad by a graph:



We can fill in the data points in between:
Now, the straight line doesn’t look as convincing.

Also, why are we looking at the number, when the population has changed over this time period? We really should care about the rate (percentage).
Measuring in terms of rates, the increase is smaller — 27%.  More importantly, though, the rate was even higher at the end of the first quarter of National’s administration than it is now.
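The arithmetic linking the ad’s 41% to the 27% rate increase is worth checking directly. The population-growth figure below is derived from the two quoted percentages, not taken from StatsNZ:

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return 100 * (new / old - 1)

# The ad's comparison: NEET counts, 2008 Q4 vs 2017 Q1
print(round(percent_change(64100, 90800), 1))  # 41.7 -- the ad's "41%"

# If the rate rose only 27%, the gap between the two figures is
# population growth over the period:
count_ratio = 90800 / 64100
rate_ratio = 1.27  # the 27% rate increase quoted above
print(round(100 * (count_ratio / rate_ratio - 1), 1))  # about 11.5
```

That is, roughly a quarter of the ad’s headline increase is just there being more young people, which is exactly why rates beat raw counts for this sort of comparison.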

The next thing to notice is the spikes every four quarters or so: NEET is higher in the summer and lower in the winter because of the school year.  You might wonder if StatsNZ had produced a seasonally adjusted version, and whether it was also conveniently on Infoshare…
The increase is now 17%.

But for long-term comparisons of policy, you’d probably want a smoothed version that incorporates more than one quarter of data. It turns out that StatsNZ have done this, too, and it’s on Infoshare.
The increase is, again, 17%. Taking out the seasonal variation, short-term variation, and sampling noise makes the underlying pattern clearer.  NEET increased dramatically in 2009, decreased, and has recently spiked. The early spike may well have been the recession, which can’t reasonably be blamed on any NZ party.  The recent increase is worrying, but thinking of it as a trend over 9 years isn’t all that helpful.

May 3, 2017

A century of immigration

Given the discussions of immigration in the past weeks, I decided to look for some historical data.  Stats NZ has a report “A Century of Censuses”, with a page on “proportion of population born overseas”. Here’s the graph:


The proportion of immigrants has never been very low, but it fell from about 1 in 2 in the late 19th century to about 1 in 6 in the middle of the 20th century, and has risen to about 1 in 4 now. The increase has been going on for the entire lifetime of any NZ member of Parliament; the oldest was born roughly at Peak Kiwi in the mid-1940s.

Seeing that immigrants have been a large minority of New Zealand for over a century doesn’t necessarily imply anything about modern immigration policy — Hume’s Guillotine, “no ought deducible from is,” cuts that off.  But I still think some people would find it surprising.


April 26, 2017

Simplifying to make a picture

1. has maps of the ancestry structure of North America, based on people who sent DNA samples in for their genotype service (click to embiggen).

To make these maps, they looked for pairs of people whose DNA showed they were distant relatives, then simplified the resulting network into relatively stable clusters. They then drew the clusters on a map and coloured them according to what part of the world those people’s distant ancestors probably came from.  In theory, this should give something like a map of immigration into the US (and to a lesser extent, of remaining Native populations).  The map is a massive oversimplification, but that’s more or less the point: it simplifies the data to highlight particular patterns (and, necessarily, to hide others).  There’s a research paper, too.


2. In a satire on predictive policing, The New Inquiry has an app showing high-risk neighbourhoods for financial crime. There’s also a story at Buzzfeed.


The app uses data from the US Financial Industry Regulatory Authority (FINRA), and models the risk of financial crime using the usual sort of neighbourhood characteristics (eg number of liquor licenses, number of investment advisers).


3. The Sydney Morning Herald had a social/political quiz “What Kind of Aussie Are You?”.


They also have a discussion of how they designed the 7 groups.  Again, the groups aren’t entirely real, but are a set of stories told about complicated, multi-dimensional data.


The challenge in any display of this type is to remove enough information that the stories are visible, but not so much that they aren’t true, and not everyone will agree on whether you’ve succeeded.