Posts filed under Surveys (181)

May 26, 2017

Big fat lies?

This is a graph from the OECD, of obesity prevalence:

The basic numbers aren’t novel. What’s interesting (as @cjsnowdon pointed out on Twitter) is the colour separation. The countries using self-reported height and weight data report lower rates of obesity than those using actual measurements. It isn’t surprising that people’s self-reported weight, over the telephone, tends to be a bit lower than what you’d actually measure if they were standing in front of you; this is a familiar problem with survey data, and usually we have no real way to tell how big the bias is.

In this example there’s something we can do.  The United States data come from the National Health And Nutrition Examination Surveys (NHANES), which involve physical and medical exams of about 5,000 people per year. The US also runs the Behavioral Risk Factor Surveillance System (BRFSS), which is a telephone interview of half a million people each year. BRFSS is designed to get reliable estimates for states or even individual counties, but we can still look at the aggregate data.

Doing the comparisons would take a bit of effort, except that one of my students, Daniel Choe, has already done it. He was looking at ways to combine the two surveys to get more accurate data than you’d get from either one separately.  One of his graphs shows a comparison of the obesity rate over a 16-year period using five different statistical models. The top right one, labelled ‘Saturated’, is the raw data.

In the US in that year the prevalence of obesity based on self-reported height and weight was under 30%.  The prevalence based on measured height and weight was about 36% — there’s a bias of about 8 percentage points. That’s nowhere near enough to explain the difference between, say, the US and France, but it is enough that it could distort the rankings noticeably.
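To see why combining the two surveys is attractive, compare their sampling errors. A minimal sketch in Python (assuming simple random sampling; the real surveys use complex designs, so these are only rough orders of magnitude):

```python
import math

# Rough standard errors for an obesity prevalence near 35%, assuming
# simple random sampling (an approximation: NHANES and BRFSS both use
# complex survey designs).
p = 0.35
for name, n in [("NHANES (exam, ~5,000/yr)", 5_000),
                ("BRFSS (phone, ~500,000/yr)", 500_000)]:
    se = math.sqrt(p * (1 - p) / n)
    print(f"{name}: SE ~ {100 * se:.2f} percentage points")
# BRFSS is roughly 10x more precise, but its self-report bias
# (~8 points) dwarfs the sampling error of either survey.
```

The point: the phone survey’s huge sample makes its sampling error tiny, but bias doesn’t shrink with sample size, which is exactly why the measured-but-small NHANES is needed as a calibration.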

As you’d expect, the bias isn’t constant: for example, other research has found the relationship between higher education and lower obesity to be weaker when using real measurements than when using telephone data.  This sort of thing is one reason doctors and medical researchers are interested in cellphone apps and gadgets such as Fitbit — to get accurate answers even from the other end of a telephone or internet connection.

April 3, 2017

How big is that?

From Stuff and the Science Media Centre

Dr Sean Weaver’s start-up business has saved over 7000 hectares of native rainforest in Southland and the Pacific

So, how much is that? I wasn’t sure, either. Here’s an official StatsChat Bogus Poll to see how good your spatial numeracy is:

The recently ex-kids are ok

The New York Times had a story last week with the headline “Do Millennial Men Want Stay-at-Home Wives?”, and this depressing graph:

But the graph doesn’t have any uncertainty indications, and while the General Social Survey is well-designed, that’s a pretty small age group (and also an idiosyncratic definition of ‘millennial’).

So, I looked up the data and drew a graph with confidence intervals (full code here)


See the last point? The 2016 data have recently been released. Adding a year of data and uncertainty indications makes it clear there’s less support for the conclusion than it looked.
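For the flavour of the calculation, here is a minimal sketch of a 95% confidence interval for a survey proportion (in Python rather than the original R, and with invented numbers — the real GSS subgroup counts aren’t shown here):

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """95% Wald confidence interval for a sample proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - z * se, p_hat + z * se)

# Invented numbers: with only ~100 respondents in a narrow age group,
# the interval spans roughly 20 percentage points.
lo, hi = wald_ci(0.58, 100)
print(f"{lo:.2f} to {hi:.2f}")  # about 0.48 to 0.68
```

With intervals that wide, a one-year wiggle in a small age group is well within sampling noise.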

Other people did similar things: Emily Beam has a long post including some context:

The Pepin and Cotter piece, in fact, presents two additional figures in direct contrast with the garbage millennial theory – in Monitoring the Future, millennial men’s support for women in the public sphere has plateaued, not fallen; and attitudes about women working have continued to improve, not worsen. Their conclusion is, therefore, that they find some evidence of a move away from gender equality – a nuance that’s since been lost in the discussion of their work.

and Kieran Healy tweeted


As a rule if you see survey data (especially on a small subset of the population) without any uncertainty displayed, be suspicious.

Also, it’s impressive how easy these sorts of analyses are with modern technology. They used to require serious computing, expensive software, and potentially some work to access the data. I did mine in an airport: commodity laptop, free WiFi, free software, user-friendly open-data archive. One reason that basic statistics training has become much more useful in the past few decades is that so many of the other barriers to DIY analysis have been removed.

March 29, 2017

Technological progress in NZ polling

From a long story at Newshub:

For the first time ever, Newshub and Reid Research will conduct 25 percent of its polling via the internet. The remaining 75 percent of polling will continue to be collected via landline phone calls, with its sampling size of 1000 respondents and its margin of error of 3.1 percent remaining unchanged. The addition of internet polling—aided by Trace Research and its director Andrew Zhu—will aim to enhance access to 18-35-year-olds, as well as better reflect the declining use of landlines in New Zealand.
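The quoted 3.1 percent is consistent with the usual worst-case formula for a sample of 1000 — a sketch, assuming simple random sampling and a 95% confidence level:

```python
import math

# Worst-case (p = 0.5) 95% margin of error for n = 1000 respondents,
# assuming simple random sampling.
n, p, z = 1000, 0.5, 1.96
moe = z * math.sqrt(p * (1 - p) / n)
print(f"{100 * moe:.1f}%")  # → 3.1%
```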

This is probably a good thing, not just because it’s getting harder to sample people. Relying on landlines leads people who don’t understand polling to assume that, say, the Greens will do much better in the election than in the polls because their voters are younger. And they don’t.

The downside of polling over the internet is it’s much harder to tell from outside if someone is doing a reasonable job of it. From the position of a Newshub viewer, it may be hard even to distinguish bogus online clicky polls from serious internet-based opinion research. So it’s important that Trace Research gets this right, and that Newshub is careful about describing different sorts of internet surveys.

As Patrick Gower says in the story

“The interpretation of data by the media is crucial. You can have this methodology that we’re using and have it be bang on and perfect, but I could be too loose with the way I analyse and present that data, and all that hard work can be undone by that. So in the end, it comes down to me and the other people who present it.”

It does. And it’s encouraging to see that stated explicitly.

January 11, 2017

Bogus poll stories, again

We have a headline today in the Herald: “New Zealand’s most monogamous town revealed”.

At first sight you might be worried this is something new that can be worked out from your phone’s sensor data, but no. It’s the result of a survey, and not even a survey of whether people are monogamous, but of whether they say they agree with the statement “I believe that monogamy is essential in a relationship” as part of the user data for a dating site that emphasises lasting relationships.

To make matters worse, this particular dating site’s marketing focuses on how different its members are from the general population. It’s not going to be a good basis for generalising to “Kiwis are strongly in favour of monogamy”.

You can find the press release here (including the embedded map) and the dating site’s “in-depth article” here.

It’s not even that nothing else is happening in the world this week.

November 13, 2016

What polls aren’t good for

From Gallup, how Americans feel about the election


We can believe the broad messages: that many people were surprised; that Trump supporters have positive feelings; that Clinton supporters have negative feelings; that there’s more anger and fear expressed than when Obama was first elected (though not more than when he was re-elected). The surprising details are less reliable.

I’ve seen people making a lot of the 3% apparent “buyer’s remorse” among Trump voters, with one tweet I saw saying those votes would have been enough to swing the election. First of all, Clinton already has more votes than Trump, just distributed suboptimally, so even if these were Trump voters who had changed their minds it might not have made any difference to the result. More importantly, though, Gallup has no way of knowing who the respondents voted for, or even if they voted at all. The table is just based on what they said over the phone.

It could be that 3% of Trump voters regret it. It could also be that some Clinton voters or some non-voters claimed to have voted for Trump.  As we’ve seen in past examples even of high-quality social surveys, it’s very hard to estimate the size of a very small subpopulation from straightforward survey data.
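A back-of-envelope sketch of how a small misreporting rate can manufacture the whole effect. All the numbers here are invented for illustration:

```python
# All numbers invented. Even if no Trump voter actually regrets their
# vote, a small rate of misreporting by everyone else can produce an
# apparent few percent of "regretful Trump voters".
trump_share = 0.46             # respondents who really voted for Trump
other_share = 1 - trump_share  # non-Trump voters and non-voters

misreport = 0.025  # fraction of the others who claim a regretted Trump vote

claimed_trump = trump_share + other_share * misreport
apparent_remorse = (other_share * misreport) / claimed_trump
print(f"{100 * apparent_remorse:.1f}%")  # ≈ 2.9% "remorse" from misreporting alone
```

A 2.5% misreporting rate among the majority group is enough to fake nearly the entire 3% signal in the minority group — the basic arithmetic of small subpopulations.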

August 6, 2016

Momentum and bounce

Momentum is an actual property of physical objects, and explanations of flight, spin, and bounce in terms of momentum (and other factors) genuinely explain something.  Electoral poll proportions, on the other hand, can only have ‘momentum’ or ‘bounce’ as a metaphor — an explanation based on these doesn’t explain anything.

So, when US pollsters talk about convention bounce in polling results, what do they actually mean? The consensus facts are that polling results improve after a party’s convention and that this improvement tends to be temporary and to produce polling results with a larger error around the final outcome.

Andrew Gelman and David Rothschild have a long piece about this at Slate:

Recent research, however, suggests that swings in the polls can often be attributed not to changes in voter intention but in changing patterns of survey nonresponse: What seems like a big change in public opinion turns out to be little more than changes in the inclinations of Democrats and Republicans to respond to polls. 

As usual, my recommendation is the relatively boring 538 polls-plus forecast, which discounts the ‘convention bounce’ very strongly.

July 31, 2016

Lucifer, Harambe, and Agrabah

Public Policy Polling has a history of asking … unusual… questions in their political polls. For example, asking if you are in favour of bombing Agrabah (the fictional country of Disney’s Aladdin), whether you think Hillary Clinton has ties to Lucifer, and whether you would vote for Harambe (the dead 17-year-old gorilla) if he were running as an independent against Trump and Clinton.

Of these three questions, the Lucifer one stands out: it comes from a familiar news issue and isn’t based on tricking the respondents. People may not answer honestly, but at least they know roughly what they are being asked and how it’s likely to be understood. Since they know what they are being asked, it’s possible to interpret the responses in a reasonably straightforward way.

Now, it’s fairly common when asking people (especially teenagers) about drug use to include some non-existent drugs for an estimate of the false-positive response rate.  It’s still pretty clear how to interpret the results: if the name is chosen well, no respondents will have a good-faith belief that they have taken a drug with that name, but they also won’t be confident that it’s a ringer.  You’re not aiming to trick honest respondents; you’re aiming to detect those that aren’t answering honestly.
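A minimal sketch of that adjustment, with all numbers invented: the “yes” rate for the fictitious drug estimates the baseline false-positive (yes-saying) rate, which can then be subtracted from the reported rate for a real drug.

```python
# All numbers invented for illustration.
reported_real = 0.12  # fraction who say they've used a real drug
reported_fake = 0.02  # fraction who say they've used the non-existent drug

# Simplest correction: treat the fake-drug "yes" rate as a baseline
# false-positive rate and subtract it.
adjusted = reported_real - reported_fake
print(f"{100 * adjusted:.1f}%")  # → 10.0%
```

This simple subtraction assumes yes-saying happens at the same rate for real and fake drug names — a strong assumption, but the design makes it explicit and checkable.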

The Agrabah question is different. There had been extensive media discussion of the question of bombing various ISIS strongholds (eg Raqqa), and this was the only live political question about bombing in the Middle East. Given the context of a serious opinion poll, it would be easy to have a good-faith belief that ‘Agrabah’ was the name of one of these ISIS strongholds and thus to think you were being asked whether bombing ISIS there was a good idea. Because of this potential confusion, we can’t tell what the respondents actually meant — we can be sure they didn’t support bombing a fictional city, but we can’t tell to what extent they were recklessly supporting arbitrary Middle-Eastern bombing versus just being successfully trolled. Because we don’t know what respondents really meant, the results aren’t very useful.

The Harambe question is different again. Harambe is under the age limit for President, from the wrong species, and dead, so what could it even mean for him to be a candidate?  The charitable view might be that Harambe’s 5% should be subtracted from the 8-9% who say they will vote for real, living, human candidates other than Trump and Clinton. On the other hand, that interpretation relies on people not recognising Harambe’s name — on almost everyone not recognising the name, given that we’re talking about 5% of responses.  I can see the attraction of using a control question rather than a half-arsed correction based on historical trends. I just don’t believe the assumptions you’d need for it to work.

Overall, you don’t have to be very cynical to suspect the publicity angle might have some effect on their question choice.

July 27, 2016

In praise of NZ papers

I whinge about NZ papers a lot on StatsChat, and even more about some of the UK stories they reprint. It’s good sometimes to look at some of the UK stories they don’t reprint.  From the Daily Express


The Brexit enthusiast and cabinet Minister John Redwood says “The poll is great news, well done to the Daily Express.” As he seems to be suggesting, you don’t get results like this just by chance — having an online bogus poll on the website of an anti-Europe newspaper is a good start.

(via Antony Unwin)

May 24, 2016


Headline: “Newshub poll: Key’s popularity plummets to lowest level”

Just 36.7 percent of those polled listed the current Prime Minister as their preferred option — down 1.6 percent from a Newshub poll in November.

National though is steady on 47 percent on the poll — a drop of just 0.3 percent — and similar to the Election night result.

So, apparently, 0.3% is “steady” and 1.6% is a “plummet”.

The reason we quote ‘maximum margin of error’, even though it’s a crude summary, not a good way to describe evidence, underestimates variability, and is a terribly misleading phrase, is that it at least gives some indication of what is worth headlining. The maximum margin of error for this poll is 3%, but the margin of error for a change between two polls is √2 ≈ 1.4 times higher, about 4.3%, because both polls contribute sampling error.

That’s the maximum margin of error, for a 50% true value, but the exact value doesn’t make much difference — I did a quick simulation to check. If nothing real had happened, the Prime Minister’s measured popularity would still plummet or soar by more than 1.6 percentage points between two polls about half the time, purely from sampling variation.
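A minimal version of that quick simulation — a sketch in Python rather than the original analysis, assuming simple random sampling, n = 1000 per poll, and true support of 36.7%:

```python
import random

random.seed(1)  # reproducible

def poll(p, n):
    """One simulated poll: observed proportion among n respondents."""
    return sum(random.random() < p for _ in range(n)) / n

# Assumptions: simple random sampling, n = 1000, true support 36.7%,
# and nothing actually changing between the two polls.
p_true, n, sims = 0.367, 1000, 2000
big_moves = sum(
    abs(poll(p_true, n) - poll(p_true, n)) > 0.016  # change > 1.6 points
    for _ in range(sims)
)
print(big_moves / sims)  # roughly 0.46: noise alone "plummets" about half the time
```

So a 1.6-point drop is about what you’d see between two polls of a completely unchanged electorate.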