Election poll averaging
The basic chart looks like this
The scatter of points around the trend line shows the sampling uncertainty. It matters that the blue dots sit above the line and the black dots sit below it: this reflects one of the limitations of NZ polls. At the last election, NZ First did better, and National did worse, than in the polling just before the election. The trend estimates assume that this discrepancy will persist. The alternative, since we’ve got just one election to work with, is to assume it was a one-off fluke that tells us nothing.
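To make the two assumptions concrete, here is a minimal sketch in Python. It is not the method behind the chart: it assumes the discrepancy is a simple additive shift in percentage points, and the function names and all the numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def election_discrepancy(pre_election_polls, election_result):
    """Election result minus the average of the final pre-election polls (points)."""
    return election_result - mean(pre_election_polls)


def bias_corrected(trend_estimate, discrepancy):
    """Shift a current trend estimate by the discrepancy seen at the last election."""
    return trend_estimate + discrepancy


# Hypothetical numbers for illustration only, not real poll results.
final_polls = [3.0, 4.0, 3.5]  # a party's share (%) in the last polls before the election
result = 6.5                   # the party's share (%) at the election
current_trend = 4.0            # today's uncorrected trend estimate (%)

d = election_discrepancy(final_polls, result)
persistent = bias_corrected(current_trend, d)  # assume the discrepancy persists
one_off = bias_corrected(current_trend, 0.0)   # assume it was a one-off fluke

print(f"discrepancy: {d:+.1f} points")
print(f"corrected estimate: {persistent:.1f}%, uncorrected: {one_off:.1f}%")
```

With only one election to calibrate against, the data cannot tell us which of the two estimates is closer to the truth; the sketch only shows how far apart they are.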
We can’t distinguish these options empirically just from the poll results, but we can think about various possible explanations, some of which could be disproved by additional evidence. One possibility is that there was a spike in NZ First popularity at the expense of National right at the election, because of Winston Peters’s reaction to the teapot affair. Another possibility is that landline telephone polls systematically undersample NZ First voters. Another is that people are less likely to tell the truth about being NZ First voters (perhaps because of media bias against Winston or something). In the US there are so many elections and so many polls that it’s possible to estimate differences between elections and polls, separately for different polling companies, and see how fast they change over time. It’s harder here. (update: Danyl Mclauchlan points me to this useful post by Gavin White)
You can see some things about different polling companies. For example, in the graph below, the large red circles are the Herald-Digipoll results. These seem a bit more variable than the others (they do have a slightly smaller sample size) but they don’t seem biased relative to the other polls. If you click on the image you’ll get the interactive version. This is the trend without bias correction, so the points scatter symmetrically around the trend lines but the trend misses the election result for National and NZ First.
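As a rough guide to how much extra scatter a smaller sample should produce, the following Python sketch uses the usual simple-random-sampling approximation for the margin of error of a proportion. Real polls use weighting and more complex designs, so this is only an approximation, and the sample sizes below are hypothetical rather than the actual sizes of any NZ poll.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(share * (1.0 - share) / sample_size)


# Hypothetical sample sizes, not the actual sizes of any NZ poll.
for n in (750, 1000):
    moe = margin_of_error(0.45, n)
    print(f"n={n}: +/- {100 * moe:.1f} percentage points")
```

Because the margin of error scales with one over the square root of the sample size, a modestly smaller sample gives only modestly more variable results, and it adds scatter without shifting the results in either direction.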
Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.