From the New York Times: “How One 19-Year-Old Illinois Man Is Distorting National Polling Averages”
There is a 19-year-old black man in Illinois who has no idea of the role he is playing in this election.
He is sure he is going to vote for Donald J. Trump.
I think the story exaggerates the impact of this guy’s opinions on polling averages, but it’s a great illustration of one of the subtleties of polling.
Even in New Zealand, you often see people claiming, for example, that opinion polls will underestimate the Green Party vote because Green voters are younger and more urban, and so are less likely to have landline phones. As we see from the actual elections, that isn’t true. Pollsters know about these simple forms of bias, and use weighting to fix them: if they poll half as many young voters as they should, each young respondent’s answer counts twice. Weighting isn’t as good as actually having a representative sample, but it’s ok, and unlike actually having a representative sample, it’s achievable.
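To make the "counts twice" idea concrete, here is a minimal sketch of weighting by group. The group names, sample counts, population shares and support levels are all made-up numbers for illustration, and the two helper functions are hypothetical, not any polling firm's actual procedure.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each group so the weighted sample matches the population shares."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] * n / sample_counts[group]
        for group in sample_counts
    }


def weighted_share(support_by_group, sample_counts, weights):
    """Weighted proportion supporting a party, given support within each group."""
    total = sum(weights[g] * sample_counts[g] for g in sample_counts)
    supporters = sum(
        weights[g] * sample_counts[g] * support_by_group[g] for g in sample_counts
    )
    return supporters / total


# Hypothetical poll: young voters are 30% of the population but 15% of the sample.
sample_counts = {"young": 150, "older": 850}
population_shares = {"young": 0.30, "older": 0.70}
# Hypothetical support for some party within each group.
support_by_group = {"young": 0.20, "older": 0.08}

weights = poststratification_weights(sample_counts, population_shares)
equal_weights = {group: 1.0 for group in sample_counts}

unweighted = weighted_share(support_by_group, sample_counts, equal_weights)
weighted = weighted_share(support_by_group, sample_counts, weights)
print(weights, unweighted, weighted)
```

With these invented numbers each young respondent gets a weight of 2, and the estimated support moves from about 9.8% unweighted to about 11.6% weighted, which is what you would get from a sample with the right age mix.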
One of the tricky parts of weighting is deciding which groups to weight by. If you make the groups too broadly defined, you don’t remove enough bias; if you make them too narrowly defined, you end up with a few people getting really extreme weights, making the sampling error much larger than it should be. That’s what happened here: the survey had one person in one of its groups, and that person turned out to be unusual. But it gets worse.
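One rough way to see the cost of extreme weights is Kish's approximate effective sample size, (sum of weights)² / (sum of squared weights). The sketch below uses an invented panel of 3000 people in which one respondent has 30 times the weight of everyone else; these figures are assumptions for illustration, not the survey's actual weights.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical panel of 3000 respondents.
equal_weights = [1.0] * 3000
one_extreme = [1.0] * 2999 + [30.0]  # one person weighted 30 times the rest

n_eff_equal = effective_sample_size(equal_weights)
n_eff_extreme = effective_sample_size(one_extreme)

# Fraction of the total weighted sample carried by the single heavy respondent.
share_of_extreme = 30.0 / sum(one_extreme)
print(n_eff_equal, n_eff_extreme, share_of_extreme)
```

Under these assumptions the effective sample size drops from 3000 to roughly 2350, and a single person accounts for about 1% of the weighted total, so the headline number moves by about a percentage point depending on which way that one person answers.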
The impact of the weighting was amplified because this is a panel survey, polling the same people repeatedly. Panel surveys are useful because they allow much more accurate estimation of changes in opinions, but an unlucky sample will persist over many surveys.
Worse still, one of the weighting factors used was how people say they voted in 2012. That sounds sensible, but it breaks one of the key assumptions about weighting variables: you need to know the population totals. We know the totals for how the population really voted in 2012, but reported vote isn’t the same thing at all — people are surprisingly unreliable at reporting how they voted in the past.
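A toy example shows why reported past vote is a risky weighting variable. Everything here is invented: two 2012 candidates "A" and "B" who split the real vote 50/50, a perfectly representative sample of 1000, and the assumption that 10% of B voters now say they voted for A. The direction and size of the misreporting are assumptions chosen only to make the arithmetic visible.

```python
from collections import Counter

# Each respondent: (true 2012 vote, reported 2012 vote, current preference).
# Hypothetical: true A voters now prefer X, true B voters now prefer Y.
respondents = (
    [("A", "A", "X")] * 500
    + [("B", "A", "Y")] * 50   # B voters who misreport having voted for A
    + [("B", "B", "Y")] * 450
)

# Known population totals for the real 2012 vote (assumed 50/50 here).
true_totals = {"A": 0.5, "B": 0.5}

n = len(respondents)
reported_counts = Counter(reported for _, reported, _ in respondents)

# Weight each person so that the REPORTED vote matches the TRUE totals.
weights = [
    true_totals[reported] * n / reported_counts[reported]
    for _, reported, _ in respondents
]

unweighted_x = sum(pref == "X" for _, _, pref in respondents) / n
weighted_x = sum(
    w for w, (_, _, pref) in zip(weights, respondents) if pref == "X"
) / sum(weights)
print(unweighted_x, weighted_x)
```

In this made-up case the unweighted sample gets support for X exactly right at 50%, while weighting on reported vote pulls it down to about 45%. The weighting "corrects" an imbalance that exists only in people's memories, and in doing so adds bias rather than removing it.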
The actual impact on polling aggregators such as 538 is probably pretty small, since they model and try to remove ‘house effects’ (differences between surveys). However, the poll does give aid and comfort to people who don’t want to believe the consensus results, and that is not helpful.