Posts filed under Social Media (82)

November 1, 2015

Twitter polls and news feeds


I don’t know why this feels worse than the bogus clicky polls on newspaper websites. Maybe it’s the thought of someone actually believing the sampling scheme says something useful. Maybe it’s being on Twitter, where following a news headline feed usually gets you news headlines. Maybe it’s that the polls are so bad: restricting a discussion of Middle East politics to two options with really short labels makes even the usual slogan-based dialogue look good in comparison.

In any case, I really hope this turns out to be a failed experiment, and that we can keep Twitter polls basically as jokes.


October 30, 2015

Pie charts “a menace”, study shows

StatsChat can reveal exclusive study results showing that pie charts are a menace to over 75% of us.

Although these round, delicious data metaphors have been maligned in the past, this is the first research of its kind, based on newly-available survey technology.

Researchers used an online, multi-wave, respondent-driven sampling scheme to reach thousands of potential respondents. 77% of responses agreed that pie charts are a menace.


Aren’t these new Twitter polls wonderful?

October 22, 2015

Early NZ data visualisation

From the National Library of New Zealand, via Jolisa Gracewood


Types of motor-vehicle accidents in rural areas vary considerably from those occurring in urban areas, as shown in the above chart. The percentages are based on figures of the Transport Department in respect of accidents causing fatalities during the twelve months, April 1, 1932, to March 31, 1933.

The text goes on to say “The black section representing collisions with tram and train forms only 1 per cent of the whole, though this type of accident appeals to the popular imagination from its spectacular nature.”  Some things don’t change.

September 21, 2015

It’s bad enough without exaggerating

This UK survey report is being a bit loose with the details, in a situation where that’s not even needed.


The survey of more than 4,000 girls, young women, parents and teachers, demonstrates clearly that there is a perception that STEM subjects and careers are better suited to male personalities, hobbies and brains. Half (51 percent) of the teachers and 43 percent of the parents surveyed believe this perception helps explain the low uptake of STEM subjects by girls. [emphasis added]

Those aren’t the same thing at all.  I believe this perception helps explain the low uptake of STEM subjects by girls. Michelle ‘Nanogirl’ Dickinson believes this perception helps explain the low uptake of STEM subjects by girls. It’s worrying that nearly half of UK teachers don’t believe this perception helps explain the low uptake of STEM subjects by girls.

On the other hand, this is depressing and actually does seem to be what the survey said:

Nearly half (47 percent) of the young girls surveyed said they believe such subjects are a better match for boys.

as does this

It would fit with NZ experience if a lot of boys felt the same about the difficulty of science and maths, but that wouldn’t actually make it any better.


September 8, 2015

Petitions and other non-representative data

Stuff has a story about the #redpeak flag campaign, including a clicky bogus poll that currently shows nearly 11,000 votes in support of the flag candidate. While Red Peak isn’t my favourite (I prefer Sven Baker’s Huihui), I like it better than the four official candidates. That doesn’t mean I like the bogus poll.

As I’ve written before, a self-selected poll is like a petition; it shows that at least the people who took part had the views they had. The web polls don’t really even show that — it’s pretty easy to vote two or three times. There’s also no check that the votes are from New Zealand — mine wasn’t, though most of them probably are.  The Stuff clicky poll doesn’t even show that 11,000 people voted for the Red Peak flag.

So far, this Stuff poll at least hasn’t been treated as news. However, the previous one has.  At the bottom of one of the #redpeak stories you can read

In a poll of 16,890 readers, 39 per cent of readers voted to keep the current flag rather than change it. 

Kyle Lockwood’s Silver Fern (black, white and blue) was the most popular alternate flag design, with 27 per cent of the vote, while his other design, Silver Fern (red, white and blue), got 23 per cent. This meant, if Lockwood fans rallied around one of his flags, they could vote one in.

Flags designed by Alofi Kanter – the black and white fern – and Andrew Fyfe each got 6 per cent or less of the vote

They don’t say, but that looks very much like this clicky poll from an earlier Stuff flag story, though it’s now up to about 17,500 votes.


You can’t use results from clicky polls as population estimates, whether for readers or the electorate as a whole. It doesn’t work.

Over approximately the same time period there was a real survey by UMR (PDF), which found only 52% of people preferred their favourite among the four flags to the current flag.  The referendum looks a lot closer than the clicky poll suggests.

The two Lockwood ferns were robustly the most popular flags in the survey, coming in as the top two for all age groups; men and women; Māori; and Labour, National and Green voters. Red Peak was one of the four least preferred in every one of these groups.

Only 1.5% of respondents listed Red Peak among their top four.  Over the whole electorate that’s still about 45,000 people, which is why an online petition with 31,000 electronic signatures should have about the impact it’s going to have on the government.

Depending on turnout, it’s going to take in the neighbourhood of a million supporting votes for a new flag to overturn the current flag. It’s going to take about the same number of votes ranking Red Peak higher than the Lockwood ferns for it to get on to the final ballot.
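For a rough sense of these orders of magnitude, here’s a back-of-the-envelope sketch in Python (the enrolment and turnout figures are my round-number assumptions, not official data):

    # Rough arithmetic behind the numbers above; enrolment and turnout
    # are round-number assumptions, not official figures.
    enrolled = 3100000             # assume roughly 3.1 million enrolled voters
    print(enrolled * 0.015)        # 1.5% naming Red Peak in their top four: ~46,500

    turnout = 0.65                 # assumed referendum turnout
    print(enrolled * turnout / 2)  # a bare majority: ~1,007,500, about a million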

In the Stuff story, Graeme Edgeler suggests that “a million people in a march” would perhaps be enough to change the government’s mind. He’s probably right, though I’d say a million estimated from a proper survey, or maybe fifty thousand in a march, should be enough. For an internet petition, perhaps two hundred thousand might be a persuasive number, if some care were taken to check that they were distinct people and eligible voters.

For those of us in a minority on flag matters, Andrew Geddis has a useful take

In fact, I’m pretty take-it-or-leave-it on the whole point of having a “national” flag. Sure, we need something to put up on public buildings and hoist a few times at sporting events. But I quite like the fact that we’ve got a bunch of other generally used national symbols that can be appropriated for different purposes. The silver fern for putting onto backpacks in Europe. The Kiwi for our armed forces and “Buy NZ Made” logos. The Koru for when we’re feeling the need to be all bi-cultural.

If you like Red Peak, fly it. At the moment, the available data suggest you’re in as much of a minority as I am.

August 2, 2015

Pie chart of the week

A year-old pie chart describing Google+ users. On the right are two slices that would make up a valid but pointless pie chart: their denominator is Google+ users. On the left, two slices that have completely different denominators: all marketers and all Fortune Global 100 companies.

On top of that, it’s unlikely that the yellow slice is correct, since it’s not clear what the relevant denominator even is. And, of course, though most of the marketers probably identify as male or female, it’s not clear how the Fortune Global 100 companies would report their gender.


From @NoahSlater, via @LewSOS, originally from kwikturnmedia about 18 months ago.

August 1, 2015

NZ electoral demographics

Two more visualisations:

Kieran Healy has graphs of the male:female ratio by age for each electorate. Here are the four with the highest female proportion, rather dramatically starting in the late teen years.



Andrew Chen has a lovely interactive scatterplot of vote for each party against demographic characteristics. For example (via Harkanwal Singh), number of votes for NZ First vs median age:



July 28, 2015

Recreational genotyping: potentially creepy?

Two stories from this morning’s Twitter (via @kristinhenry):

  • 23andMe has made available a programming interface (API) so that you can access and integrate your genetic information using apps written by other people.  Someone wrote and published code that could be used to screen users based on sex and ancestry (Buzzfeed, FastCompany). It’s not a real threat: apps with more than 20 users need to be reviewed by 23andMe, users have to agree to let the code use their data, and Facebook knows far more about you than 23andMe anyway. But it’s not a good look.
  • Google’s Calico project also does cheap public genotyping and is combining their DNA data (from more than a million people) with family trees from Ancestry.com. This is how genetic research used to be done: since we know how DNA is inherited, connecting people with family trees deep into the past provides a lot of extra information. On the other hand, it means that if a few distantly-related people sign up for Calico genotyping, Google will learn a lot about the genomes of all their relatives.

It’s too early to tell whether the people who worry about this sort of thing will end up looking prophetic or just paranoid.

June 18, 2015

Bogus poll story again

For a while, the Herald largely gave up basing stories on bogus clicky polls. Today, though, there was a story about Gurpreet Singh, who was barred from the Manurewa Cosmopolitan Club for refusing to remove his turban.

The headline is “Sikh club ban: How readers reacted”, and the first sentence says:

Two thirds of respondents to an online NZ Herald poll have backed the controversial Cosmopolitan Club that is preventing turbaned Sikhs from entering due to a ban on hats and headgear.

In some ways this is better than the old-style bogus poll stories that described the results as a proportion of Kiwis or readers or Aucklanders. It doesn’t make the number mean anything much, but presumably the sentence was at least true at the time it was written.

A few minutes ago I looked at the original story and the clicky poll next to it:


There are two things to note here. First, the question is pretty clearly biased: to register disagreement with the club you have to say that they were completely in the wrong and that Mr Singh should take his complaint further. Second, the “two thirds of respondents” backing the club has fallen to 40%. Bogus polls really are even more useless than you think they are, no matter how useless you think they are.
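To see how little a clicky poll’s running total means, here’s a toy simulation in Python (all the numbers are invented, and it isn’t a model of the Herald’s readership): two audiences with different views arrive at different times, and the running percentage tracks whoever has turned up so far rather than anything about a population.

    import random
    random.seed(1)

    # Toy clicky poll: an early wave of mostly supporters, then a later
    # wave of mostly opponents. No opinions change; only the arrival mix does.
    votes = []
    for i in range(3000):
        p_support = 0.67 if i < 1000 else 0.25   # invented wave probabilities
        votes.append(random.random() < p_support)
        if (i + 1) % 1000 == 0:
            print(i + 1, round(100 * sum(votes) / len(votes)))
    # prints roughly 67, then 46, then 39: the "result" depends on when you look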

But it’s worse than that. Because of anchoring bias, the “two thirds” figure has an impact even on people who know it is completely valueless: it makes you less informed than you were before. As an illustration, how did you feel about the 40% figure in the new results? Reassured that it wasn’t as bad as the Herald had claimed, or outraged at the level of ignorance and/or bigotry represented by 40% support for the club?


June 5, 2015

Peacocks’ tails and random-digit dialing

People who do surveys using random-digit dialling tend to think that it, or similar attempts to sample in a representative way, is very important, and sometimes attack the idea of public-opinion inference from convenience samples as wrong in principle.  People who use careful adjustment and matching to calibrate a sample to the target population are annoyed by this, and point out not only that statistical modelling is a perfectly reasonable alternative, but that response rates are typically so low that attempts at random sampling also rely heavily on explicit or implicit modelling of non-response to get useful results.
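The simplest version of that adjustment is post-stratification: weight each respondent so the sample’s demographic mix matches the population’s. Here’s a minimal sketch in Python, with every proportion invented for illustration:

    # Post-stratification sketch: reweight an opt-in sample so its age mix
    # matches the population. All proportions are invented for illustration.
    population = {"18-39": 0.40, "40-64": 0.40, "65+": 0.20}
    sample     = {"18-39": 0.60, "40-64": 0.30, "65+": 0.10}  # opt-in skews young
    support    = {"18-39": 0.30, "40-64": 0.50, "65+": 0.70}  # support for X, by age

    raw      = sum(sample[g] * support[g] for g in sample)          # 0.40
    adjusted = sum(population[g] * support[g] for g in population)  # 0.46
    print(round(raw, 2), round(adjusted, 2))

Real adjustment uses many more cells, and modelling to stabilise the small ones, but the principle is the same.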

Andrew Gelman has a new post on this issue, and it’s an idea that I think should be taken further (in a slightly different direction) than he seems to take it.

It goes like this. If it becomes widely accepted that properly adjusted opt-in samples can give reasonable results, then there’s a motivation for survey organizations to not even try to get representative samples, to simply go with the sloppiest, easiest, most convenient thing out there. Just put up a website and have people click. Or use Mechanical Turk. Or send a couple of interviewers with clipboards out to the nearest mall to interview passersby. Whatever. Once word gets out that it’s OK to adjust, there goes all restraint.

I think it’s more than that, and related to the idea of signalling in economics and evolutionary biology: the idea that peacocks’ tails are adaptive not because they are useful but because they are expensive and useless.

Doing good survey research is hard for lots of reasons, only some involving statistics. If you are commissioning or consuming a survey you need to know whether it was done by someone who cared about the accuracy of the results, or someone who either didn’t care or had no clue. It’s hard to find that out, even if you, personally, understand the issues.

Back in the day, one way you could distinguish real surveys from bogus polls was that real surveys used random-digit dialling, and bogus polls didn’t. In part, that was because random-digit dialling worked, and other approaches didn’t so much. Almost everyone had exactly one home phone number, so random dialling meant random sampling of households, and most people answered the phone and responded to surveys.  On top of that, though, the infrastructure for random-digit dialling was expensive. Installing it showed you were serious about conducting accurate surveys, and demanding it showed you were serious about paying for accurate results.

Today, response rates are much lower, cell-phones are common, links between phone number and geographic location are weaker, and the correspondence between random selection of phones and random selection of potential respondents is more complicated. Random-digit dialling, while still helpful, is much less important to survey accuracy than it used to be. It still has a lot of value as a signalling mechanism, distinguishing Gallup and Pew Research from Honest Joe’s Sample Emporium and website clicky polls.

Signalling is valuable to the signaller and to the consumer, but it’s harmful to people trying to innovate.  If you’re involved with a serious endeavour in public opinion research that recruits a qualitatively representative panel and then spends its money on modelling rather than on sampling, you’re going to be upset with the spreading of fear, uncertainty, and doubt about opt-in sampling.

If you’re a panel-based survey organisation, the challenge isn’t to maintain your principles and avoid doing bogus polling; it’s to find some new way for consumers to distinguish your serious estimates from other people’s bogus ones. They’re not going to do it by evaluating the quality of your statistical modelling.