Posts filed under Polls (105)

June 5, 2015

Peacocks’ tails and random-digit dialing

People who do surveys using random-digit phone dialling tend to think that random-digit dialling or similar attempts to sample in a representative way are very important, and sometimes attack the idea of public-opinion inference from convenience samples as wrong in principle. People who use careful adjustment and matching to calibrate a sample to the target population are annoyed by this, and point out not only that statistical modelling is a perfectly reasonable alternative, but that response rates are typically so low that attempts at random sampling also rely heavily on explicit or implicit modelling of non-response to get useful results.
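For concreteness, the simplest version of that ‘careful adjustment and matching’ is post-stratification: weight each respondent by how under- or over-represented their demographic cell is in the sample. A minimal sketch in Python, with made-up cells, shares, and support numbers:

    # Minimal sketch of post-stratification: reweight a convenience sample so
    # its demographic mix matches the target population. All numbers here are
    # hypothetical, for illustration only.
    population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
    sample_share = {"18-34": 0.10, "35-54": 0.30, "55+": 0.60}  # panel skews old

    # Each respondent in a cell gets weight = population share / sample share.
    weights = {cell: population_share[cell] / sample_share[cell]
               for cell in population_share}

    # Hypothetical support for some proposition within each cell.
    support = {"18-34": 0.62, "35-54": 0.55, "55+": 0.41}
    raw = sum(sample_share[c] * support[c] for c in support)
    adjusted = sum(sample_share[c] * weights[c] * support[c] for c in support)
    print(f"raw: {raw:.3f}  post-stratified: {adjusted:.3f}")  # 0.473 vs 0.522

Real calibration (raking, multilevel regression and post-stratification) is more elaborate than this, but the disagreement is about whether this kind of modelling can substitute for random sampling, not about the arithmetic.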

Andrew Gelman has a new post on this issue, and it’s an idea that I think should be taken further (in a slightly different direction) than he seems to take it.

It goes like this. If it becomes widely accepted that properly adjusted opt-in samples can give reasonable results, then there’s a motivation for survey organizations to not even try to get representative samples, to simply go with the sloppiest, easiest, most convenient thing out there. Just put up a website and have people click. Or use Mechanical Turk. Or send a couple of interviewers with clipboards out to the nearest mall to interview passersby. Whatever. Once word gets out that it’s OK to adjust, there goes all restraint.

I think it’s more than that, and related to the idea of signalling in economics and evolutionary biology: the idea that peacocks’ tails are adaptive not because they are useful but because they are expensive and useless, so that only a genuinely fit bird can afford to carry one.

Doing good survey research is hard for lots of reasons, only some involving statistics. If you are commissioning or consuming a survey you need to know whether it was done by someone who cared about the accuracy of the results, or someone who either didn’t care or had no clue. It’s hard to find that out, even if you, personally, understand the issues.

Back in the day, one way you could distinguish real surveys from bogus polls was that real surveys used random-digit dialling, and bogus polls didn’t. In part, that was because random-digit dialling worked, and other approaches didn’t so much. Almost everyone had exactly one home phone number, so random dialling meant random sampling of households, and most people answered the phone and responded to surveys.  On top of that, though, the infrastructure for random-digit dialling was expensive. Installing it showed you were serious about conducting accurate surveys, and demanding it showed you were serious about paying for accurate results.

Today, response rates are much lower, cell-phones are common, links between phone number and geographic location are weaker, and the correspondence between random selection of phones and random selection of potential respondents is more complicated. Random-digit dialling, while still helpful, is much less important to survey accuracy than it used to be. It still has a lot of value as a signalling mechanism, distinguishing Gallup and Pew Research from Honest Joe’s Sample Emporium and website clicky polls.

Signalling is valuable to the signaller and to the consumer, but it’s harmful to people trying to innovate. If you’re involved with a serious endeavour in public-opinion research that recruits a qualitatively representative panel and then spends its money on modelling rather than on sampling, you’re going to be upset by the spreading of fear, uncertainty, and doubt about opt-in sampling.

If you’re a panel-based survey organisation, the challenge isn’t to maintain your principles and avoid doing bogus polling, it’s to find some new way for consumers to distinguish your serious estimates from other people’s bogus ones. They’re not going to do it by evaluating the quality of your statistical modelling.

 

May 17, 2015

Polling is hard

Part One: Affiliation and pragmatics

The US firm Public Policy Polling released a survey of (likely) US Republican primary voters last week.  This firm has a habit of including the occasional question that some people would consider ‘interesting context’ and others would call ‘trolling the respondents.’

This time it was a reference to the conspiracy theory about the Jade Helm military exercises in Texas: “Do you think that the Government is trying to take over Texas or not?”

32% of respondents said “Yes”. 28% said “Not sure”. Less than half were confident there wasn’t an attempt to take over Texas. There doesn’t seem to be widespread actual belief in the annexation theory, in the sense that no-one is doing anything to prepare for or prevent it. We can be pretty sure that most of that 60% were not telling the truth: their answer was an expression of affiliation rather than an accurate reflection of their beliefs. That sort of thing can be a problem for polling.

Part Two: Mode effects and social pressure

The American Association for Public Opinion Research is holding its annual conference, so there’s new and exciting survey research coming out (to the extent that ‘new and exciting survey research’ isn’t an oxymoron). The Pew Research Center took two random groups of 1500 people from one of their panels and asked one group questions over the phone and the other group the same questions on a web form. For most questions the two groups agreed pretty well: not much more difference than you’d expect from random sampling variability. For some questions, the differences were big:

[Chart: Pew mode study — questions with large differences between phone and web responses]

It’s not possible to tell from these data which set of answers is more accurate, but the belief in the field is that people give more honest answers to computers than to other people.
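As a rough benchmark for ‘what you’d expect from random sampling variability’: with two independent groups of 1500 and the worst-case p = 0.5, the usual normal approximation gives about a 3.6-percentage-point margin of error for the difference. A quick sketch (the group size is from the Pew design above; the rest is the standard approximation, not Pew’s own analysis):

    import math

    n = 1500                              # respondents per mode (phone, web)
    p = 0.5                               # worst-case proportion
    se_one = math.sqrt(p * (1 - p) / n)   # standard error for one group
    se_diff = math.sqrt(2) * se_one       # SE of the phone-minus-web difference
    print(f"95% MoE, one group:  {1.96 * se_one:.1%}")   # about 2.5%
    print(f"95% MoE, difference: {1.96 * se_diff:.1%}")  # about 3.6%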

March 31, 2015

Polling in the West Island: cheap or good?

New South Wales has just voted, and the new electorate covering the area where I lived in Sydney 20 years ago is being won by the Greens, who got 46.4% of the primary vote and are currently on 59.7% after preferences. The ABC News background about the electorate says

In 2-party preferred terms this is a safe Labor seat with a margin of 13.7%, but in a two-candidate contest would be a marginal Green seat versus Labor. The estimated first preference votes based on the 2011 election are Green 35.5%, Labor 30.4%, Liberal 21.0%, Independent 9.1, the estimated Green margin after preferences being 4.4% versus Labor.

There has definitely been a change in this area since 2011, so how did the polls do? Political polling is a bit harder under preferential voting even when there are only two relevant parties, and much harder when there are more than two.
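As a toy illustration of the extra difficulty, here’s a two-candidate-preferred (2CP) calculation in Python using the ABC’s estimated primary votes quoted above. The preference-flow fractions are invented, which is the point: a pollster has to get those right too, not just the primary votes.

    # Toy two-candidate-preferred (2CP) calculation. Primary votes are the
    # ABC's 2011-based estimates quoted above; the preference-flow fractions
    # are hypothetical, and the answer is sensitive to them.
    primary = {"Green": 35.5, "Labor": 30.4, "Liberal": 21.0, "Independent": 9.1}

    # Assumed share of each excluded candidate's preferences flowing to the
    # Greens (the remainder flows to Labor). Made-up numbers.
    flow_to_green = {"Liberal": 0.35, "Independent": 0.55}

    green, labor = primary["Green"], primary["Labor"]
    for party, share in flow_to_green.items():
        green += primary[party] * share
        labor += primary[party] * (1 - share)

    total = green + labor
    print(f"Green {100 * green / total:.1f}% vs Labor {100 * labor / total:.1f}%")

With these made-up flows Labor narrowly holds the seat; raise the assumed Liberal flow to the Greens from 35% to 45% and the Greens win it.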

Well, the reason for mentioning this is a piece in the Australian saying that the swing to the Greens caught Labor by surprise because they’d used cheap polls for electorate-specific prediction:

“We just can’t poll these places accurately at low cost,” a Labor strategist said. “It’s too hard. The figures skew towards older voters on landlines and miss younger voters who travel around and use mobile phones.”

The company blamed in the story is ReachTEL. They report that they had the most accurate overall results, but their published poll from 19 March for Newtown was definitely off, giving the Greens 33.3% support against the 46.4% they actually received.

(via Peter Green on Twitter)

 

December 20, 2014

Not enough pie

From James Lee Gilbert on Twitter, a pie chart from WXII News (Winston-Salem, North Carolina)

[Pie chart from WXII News: slices summing to only 70%, with the local team’s one-third drawn as half the pie]

This is from a (respectable, if pointless) poll conducted in North Carolina. As you can clearly see, half of the state favours the local team. Or, as you can clearly see from the numbers, one-third of the state does.

If you’re going to use a pie chart (which you usually shouldn’t), remember that the ‘slices of pie’ metaphor is the whole point of the design. If the slices only add up to 70%, you need to either add the “Other”/”Don’t Know”/”Refused” category, or choose a different graph.

If your graph makes it easy to confuse 1/3 and 1/2, it’s not doing its job.
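The fix is one conditional before you draw. A matplotlib sketch, with hypothetical numbers loosely modelled on the WXII chart (the real categories aren’t reported here):

    import matplotlib.pyplot as plt

    # If the reported slices only cover 70%, add the missing 30% as an
    # explicit category before drawing the pie. Numbers are hypothetical.
    labels = ["Local team", "Rival team", "Neither"]
    shares = [33, 25, 12]                      # as reported: sums to 70

    missing = 100 - sum(shares)
    if missing > 0:                            # make the slices exhaust the pie
        labels.append("Other / don't know")
        shares.append(missing)

    plt.pie(shares, labels=labels, autopct="%1.0f%%", startangle=90)
    plt.title("Which team do you support?")
    plt.axis("equal")                          # keep the pie circular
    plt.show()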

December 8, 2014

Political opinion: winning the right battles

From Lord Ashcroft (UK, Conservative) via Alex Harroway (UK, decidedly not Conservative), an examination of trends in UK opinion on a bunch of issues, graphed by whether they favour Labour or the Conservatives, and how important they are to respondents. It’s an important combination of information, and a good way to display it (or it would be if it weren’t a low-quality JPEG)

[Chart: Ashcroft’s map of issues by importance to voters and Conservative vs Labour lead (low-quality JPEG)]

 

Ashcroft says

The higher up the issue, the more important it is; the further to the right, the bigger the Conservative lead on that issue. The Tories, then, need as many of these things as possible to be in the top right quadrant.

Two things are immediately apparent. One is that the golden quadrant is pretty sparsely populated. There is currently only one measure – being a party who will do what they say (in yellow, near the centre) – on which the Conservatives are ahead of Labour and which is of above average importance in people’s choice of party.

and Alex expands

When you campaign, you’re trying to do two things: convince, and mobilise. You need to win the argument, but you also need to make people think it was worth having the argument. The Tories are paying for the success of pouring abuse on Miliband with the people turned away by the undignified bully yelling. This goes, quite clearly, for the personalisation strategy in general.

November 5, 2014

US election graphics

Facebook has a live map of people who have mentioned on Facebook that they voted (via Jason Sundram)

[Map: Facebook’s live ‘I voted’ mentions map]

USA Today showed a video including a Twitter live map

[Map: Twitter election-mention live map from the USA Today video]

These both have the usual problem with maps of how many people do something: there are more people in some places than others. As usual, XKCD puts it well:

[xkcd comic: maps of where people do things that are basically just population maps]

Useful statistics is about comparisons, and this comparison basically shows that more people live in New York than in New Underwood.
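The fix, as with the pie chart, is to make the comparison explicit: divide by population before you map. A sketch with invented numbers:

    # Divide counts by population before mapping; otherwise every map of
    # "where people did X" is just a population map. Numbers are invented.
    mentions = {"New York, NY": 120_000, "New Underwood, SD": 40}
    population = {"New York, NY": 8_400_000, "New Underwood, SD": 660}

    for place in mentions:
        per_1000 = 1000 * mentions[place] / population[place]
        print(f"{place}: {mentions[place]} mentions, {per_1000:.0f} per 1000 residents")

On raw counts New York wins by three and a half orders of magnitude; per head of population, the small town is ahead.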

As usual, the New York Times has informative graphics, including a live set of projections for the interesting seats.

 

September 19, 2014

Not how polling works

The Herald interactive for election results looks really impressive. The headline infographic for the latest poll, not so much. The graph is designed to display changes between two polls, for which the margin of error is about 1.4 times (a factor of √2) as large as for a single poll: the margin of error for National goes beyond the edge of the graph.

[Graph: Herald DigiPoll infographic of poll-to-poll changes, with margins of error running off the edge]

 

The lead for the story is worse

The Kim Dotcom-inspired event in Auckland’s Town Hall that was supposed to end John Key’s career gave the National Party an immediate bounce in support this week, according to polling for the last Herald DigiPoll survey.

Since both the Dotcom and Greenwald/Snowden Moments of Truth happened in the middle of polling, they’ve split the results into before/after Tuesday. That is, rather than showing an average of polls, or even a single poll, or even a change between two polls, they are headlining the change between the first and second halves of a single poll!

The observed “bounce” was 1.3%. The quoted margin of error at the bottom of the story is 3.5%, from a poll of 775 people. The actual margin of error for a change between the first and second halves of the poll is about 7%.

Only in the Internet Party’s wildest dreams could this split-half comparison have told us anything reliable. It would need the statistical equivalent of the CSI magic video-zoom enhance button to work.
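The arithmetic, for anyone who wants to check it: the standard error of a difference between two independent estimates is √2 times the single-estimate standard error (which is also where the previous post’s factor of 1.4 comes from), and halving the sample size inflates the margin of error by another factor of √2. A sketch at the usual worst case, p = 0.5:

    import math

    def moe(n, p=0.5, z=1.96):
        """95% margin of error for a proportion from a sample of size n."""
        return z * math.sqrt(p * (1 - p) / n)

    n = 775  # poll size quoted in the story
    print(f"single poll of {n}:         {moe(n):.1%}")                    # 3.5%
    print(f"change between two polls:   {math.sqrt(2) * moe(n):.1%}")     # 5.0%
    print(f"change between poll halves: {math.sqrt(2) * moe(n / 2):.1%}") # 7.0%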

 

September 18, 2014

Interactive election results map

The Herald has an interactive election-results map, which will show results for each polling place as they come in, together with demographic information about each electorate.  At the moment it’s showing the 2011 election data, and the displays are still being refined — but the Herald has started promoting it, so I figure it’s safe for me to link as well.

Mashblock is also developing an election site. At the moment they have enrolment data by age. Half the people under 35 in Auckland Central seem to be unenrolled, which is a bit scary. Presumably some of them are students enrolled at home, and some haven’t been in NZ long enough to enrol, but still.

Some non-citizens probably don’t know that they are eligible — I almost missed out last time. So, if you know someone who is a permanent resident and has lived in New Zealand for a year, you might just ask if they know about the eligibility rules. Tomorrow is the last day.

September 8, 2014

Poll meta-analyses in NZ

As we point out from time to time, single polls aren’t very accurate and you need sensible averaging.

There are at least three sets of averages for NZ:

1. Peter Green’s analyses, which get published at DimPost (larger parties, smaller parties). The full code is here.

2. Pundit’s poll of polls. They have a reasonably detailed description of their approach and it follows what Nate Silver did for the US elections.

3. Curiablog’s time and size weighted average. The methodology is described here.

The implementors of these cover a reasonable spectrum of NZ political affiliation. The results agree fairly closely except for one issue: Peter Green adds a correction to make the predictions go through the 2011 election results, which no-one else does.

According to Gavin White, there is a historical tendency for National to do a bit worse and NZ First to do a bit better in the election than in the polls, so you’d want to correct for this, but you could also argue that the effect was stronger than usual at the last election so this might overcorrect.
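For concreteness, here’s a minimal Python sketch of a time-and-size-weighted average in the spirit of Curiablog’s method. The polls and the seven-day half-life are invented, and real implementations (and a house-effect correction like Peter Green’s) are considerably more careful than this:

    # Time-and-size-weighted poll average. All polls and the half-life are
    # hypothetical; this is the flavour of the method, not anyone's actual code.
    polls = [
        # (days ago, sample size, party support in %)
        (2, 750, 47.0),
        (6, 1000, 45.5),
        (13, 850, 48.2),
    ]

    HALF_LIFE = 7.0  # weight halves every 7 days -- an assumption

    num = den = 0.0
    for age, n, pct in polls:
        w = n * 0.5 ** (age / HALF_LIFE)  # fresher and bigger polls count more
        num += w * pct
        den += w
    print(f"weighted average: {num / den:.1f}%")  # 46.6%

A correction like Peter Green’s would then shift each poll by the pollster’s estimated historical bias before averaging.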

In addition to any actual changes in preferences over the next couple of weeks, there are three polling issues we don’t have a good handle on:

  • Internet Mana is new, and you could make a plausible case that their supporters might be harder for the pollsters to get a good grip on (note: age and ethnicity aren’t enough here; the pollsters do take account of those).
  • There seems to have been a big increase in ‘undecided’ respondents to the polls, apparently from former Labour voters. To the extent that this is new, no-one really knows what they will do on the day.
  • Polling for electorates is harder, especially when strategic voting is important, as in Epsom.

 

[Update: thanks to Bevan Weir in comments, there’s also a Radio NZ average. It’s a simple unweighted average with no smoothing, which isn’t ideal for estimation but has the virtue of simplicity]

August 28, 2014

Bogus polls

This is a good illustration of why they’re meaningless…

[Image: example of bogus self-selected poll results]