Posts filed under Surveys (151)

April 14, 2015

Northland school lunch numbers

Last week’s Stat of the Week nomination for the Northern Advocate didn’t, we thought, point out anything particularly egregious. However, it did provoke me to read the story; I’d previously only seen the headline 22% statistic on Twitter. The story starts:

Northland is in “crisis” as 22 per cent of students from schools surveyed turn up without any or very little lunch, according to the Te Tai Tokerau Principals Association.

‘Surveyed’ is presumably a gesture in the direction of the non-response problem: it’s based on information from about 1/3 of schools, which is made clear in the story. And it’s not as if the number actually matters: the Te Tai Tokerau Principals Association basically says it would still be a crisis if the true figure were only a third as large (ie, if there were no cases in schools that didn’t respond), and the Government isn’t interested in the survey.

More evidence that the number doesn’t matter is that no-one seems to have done simple arithmetic. Later in the story we read

The schools surveyed had a total of 7352 students. Of those, 1092 students needed extra food when they came to school, he said.

If you divide 1092 by 7352 you don’t get 22%. You get 15%.  There isn’t enough detail to be sure what happened, but a plausible explanation is that 22% is the simple average of the proportions in the schools that responded, ignoring the varying numbers of students at each school.
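As a rough sketch of how the two figures could come apart, the code below compares the pooled proportion with the simple average of per-school proportions. Only the totals 1092 and 7352 come from the story; the three schools are invented for illustration, since the story gives no per-school figures.

```python
# Pooled proportion, using the totals quoted in the story.
needing_food = 1092
total_students = 7352
pooled = needing_food / total_students
print(f"pooled: {pooled:.1%}")

# Hypothetical schools (made up): (students needing food, roll size).
schools = [(30, 60), (40, 200), (50, 1000)]


def pooled_proportion(schools):
    """All students needing food divided by all students."""
    return sum(n for n, _ in schools) / sum(roll for _, roll in schools)


def simple_average(schools):
    """Average of the per-school proportions, ignoring school size."""
    return sum(n / roll for n, roll in schools) / len(schools)


print(f"hypothetical pooled: {pooled_proportion(schools):.1%}")
print(f"hypothetical simple average: {simple_average(schools):.1%}")
```

When small schools have higher proportions than large ones, as in these made-up numbers, the simple average comes out well above the pooled figure.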

The other interesting aspect of this survey (again, if anyone cared) is that we know a lot about schools and so it’s possible to do a lot to reduce non-response bias.  For a start, we know the decile for every school, which you’d expect to be related to food provision and potentially to response. We know location (urban/rural, which district). We know which are State Integrated vs State schools, and which are Kaupapa Māori. We know the number of students, statistics about ethnicity. Lots of stuff.

As a simple illustration, here’s how you might use decile and district information.  In the Far North district there are (using Wikipedia because it’s easy) 72 schools.  That’s 22 in decile one, 23 in decile two, 16 in decile three, and 11 in deciles four and higher.  If you get responses from 11 of the decile-one schools and only 4 of the decile-three schools, you need to give each student in those decile-one schools a weight of 22/11=2 and each student in the decile-three schools a weight of 16/4=4. To the extent that decile predicts shortage of food you will increase the precision of your estimate, and to the extent that decile also predicts responding to the survey you will reduce the bias.
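The reweighting can be sketched in a few lines. The school counts and the 11 and 4 responding schools are the ones in the example above; the per-school food and roll numbers are hypothetical, there only to show the weights being applied.

```python
# Number of schools per decile group in the Far North district (from the post).
population = {"decile 1": 22, "decile 2": 23, "decile 3": 16, "decile 4+": 11}

# Responding schools in the two groups used in the example.
respondents = {"decile 1": 11, "decile 3": 4}


def stratum_weights(population, respondents):
    """Weight = schools in the group / responding schools in the group."""
    return {s: population[s] / respondents[s] for s in respondents}


def weighted_proportion(records, weights):
    """records: (group, students needing food, roll) for each responding school."""
    numerator = sum(weights[s] * n for s, n, _ in records)
    denominator = sum(weights[s] * roll for s, _, roll in records)
    return numerator / denominator


weights = stratum_weights(population, respondents)
print(weights)

# Hypothetical responding schools, one per group (made-up numbers).
records = [("decile 1", 30, 100), ("decile 3", 10, 100)]
print(f"weighted: {weighted_proportion(records, weights):.1%}")
```

With these invented records the unweighted figure would be 40/200 = 20%; the weighted figure is lower because the under-represented decile-three school counts for more.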

This basic approach is common in opinion polls. It’s the reason, for example, that the Green Party’s younger, mobile-phone-using support isn’t massively underestimated in election polls. In opinion polls, the main limit on this reweighting technique is the limited amount of individual information for the whole population. In surveys of schools there’s a huge amount of information available, and the limit is sample size.

February 19, 2015

West Island census under threat?

From the Sydney Morning Herald

Asked directly whether the 2016 census would go ahead as planned on August 9, a spokeswoman for the parliamentary secretary to the treasurer Kelly O’Dwyer read from a prepared statement.

It said: “The government and the Bureau of Statistics are consulting with a wide range of stakeholders about the best methods to deliver high quality, accurate and timely information on the social and economic condition of Australian households.”

Asked whether that was an answer to the question: “Will the census go ahead next year?” the spokeswoman replied that it was.

Unlike in Canada, it’s suggested they would at least save money in the short term. It’s the longer-term consequences of reduced information quality that are a concern: not just directly for Census questions, but for all surveys that use Census data to compensate for sampling bias. How bad this would be depends on what is used to replace the Census: if it’s a reasonably large mandatory-response survey (as in the USA), it could work well. If it’s primarily administrative data, probably not so much.

In New Zealand, the current view is that we do still need a census.

Key findings are that existing administrative data sources cannot at present act as a replacement for the current census, but that early results have been sufficiently promising that it is worth continuing investigations.

February 3, 2015

Meet Statistics summer scholar Daniel van Vorsselen

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Daniel, right, is working on a project called Working with data from conservation monitoring schemes with Associate Professor Rachel Fewster. Daniel explains:

“The university is involved in a project called CatchIT, an online system that aims to help community conservation schemes by providing users with a place where they can input and store their data for reference. The project also produces maps and graphics so that users can assess the effectiveness of their conservation schemes and identify areas where changes can be made.

“My role in the project is to help analyse the data that users put into the project. This involves correctly formatting and cleaning the data so that it is usable. I assist users in the technical aspects relating to their data and help them communicate their data in a meaningful way.

“It’s important to maintain and preserve the wildlife and plant species we have in New Zealand so that future generations have the opportunity to experience them as we have. Our environments are a defining factor of our culture and lifestyles as New Zealanders and we have a large amount of native species in New Zealand. It would be a shame to see them eradicated.

“I am currently studying a BCom/BA conjoint, majoring in Statistics, Economics and Finance. I’m hoping to do Honours in statistics and I am looking at a career in banking.

“Over summer, I hope to enjoy the nice weather, whether out on the boat fishing, at the beach or going for a run.”

January 31, 2015

Big buts for factoid about lying

At StatsChat, we like big buts, and an easy way to find them is unsourced round numbers in news stories. From the Herald (reprinted from the Telegraph, last November)

But it’s surprising to see the stark figure that we lie, on average, 10 times a week.

It seems that this number comes from an online panel survey in the UK last year (Telegraph, Mail) — it wasn’t based on any sort of diary or other record-keeping, people were just asked to come up with a number. Nearly 10% of them said they had never lied in their entire lives; this wasn’t checked with their mothers.  A similar poll in 2009 came up with much higher numbers: 6/day for men, 3/day for women.

Another study, in the US, came up with an estimate of 11 lies per week: people were randomised to trying not to lie for ten weeks, and the 11/week figure was from the control group.  In this case people really were trying to keep track of how often they lied, but they were a quite non-representative group. The randomised comparison will be fair, but the actual frequency of lying won’t be generalisable.

The averages are almost certainly misleading, because there’s a lot of variation between people. So when the Telegraph says

The average Briton tells more than 10 lies a week,

or the Mail says

the average Briton tells more than ten lies every week,

they probably mean the average number of self-reported lies was more than 10/week, with the median being much lower. The typical person lies much less often than the average.
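A small sketch of how a few heavy liars can pull the average well above the typical value. The ten weekly counts below are entirely made up; none of the surveys published individual responses.

```python
from statistics import mean, median

# Hypothetical self-reported lies per week for ten people (invented data).
lies_per_week = [0, 0, 1, 2, 2, 3, 4, 5, 10, 80]

average = mean(lies_per_week)
typical = median(lies_per_week)

print(f"mean: {average}, median: {typical}")
```

In this invented sample the mean is just over 10 a week, but eight of the ten people report five or fewer.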

These figures are all based on self-reported remembered lies, and all broadly agree, but another study, also from the US, shows that things are more complicated:

Participants were unaware that the session was being videotaped through a hidden camera. At the end of the session, participants were told they had been videotaped and consent was obtained to use the video-recordings for research.

The students were then asked to watch the video of themselves and identify any inaccuracies in what they had said during the conversation. They were encouraged to identify all lies, no matter how big or small.

The study… found that 60 percent of people lied at least once during a 10-minute conversation and told an average of two to three lies.

January 23, 2015

Meet Statistics summer scholar Bo Liu

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Bo, right, is working on a project called Construction of life-course variables for the New Zealand Longitudinal Census (NZLC) with Roy Lay-Yee, Senior Research Fellow at the COMPASS Research Centre, University of Auckland, and Professor Alan Lee of Statistics. Bo explains:

“The New Zealand Longitudinal Census has linked individuals across the 1981-2006 New Zealand censuses. This enables the assessment of life-course resources with various outcomes.

“I need to create life-course variables such as socio-economic status, health, education, work, family ties and cultural identity from the censuses. Sometimes such information is not given directly in the census questions, but several pieces of information need to be combined together.

“An example is the overcrowding index that measures the personal living space. We need to combine the age, partnership status of the residents and number of bedrooms in each dwelling to derive the index.

“Also, the format of the questionnaire as well as the answers used in each census were rather different, so data-cleaning is required. I need to harmonise information collected in each census so that they are consistent and can be compared over different censuses. For example, in one census the gender might be given code ‘0’ and ‘1’ representing female and male, but in another census the gender was given code ‘1’ and ‘2’. Thus the code ‘1’ can mean quite different things in different censuses. My job is to find these differences and gaps in each census.

“The results of this project will enable future studies based on New Zealand longitudinal censuses, say, for example, the influence of life-course variables on the risk of mortality. This project will also be a very good experience for my future career, since data-cleaning is a very important process that we were barely taught in our courses but will actually cost almost one-third of the time in most real-life projects. When we were studying statistics courses, most data sets we encountered were “toy” data sets that had fewer variables and observations and were clean. However, in real life, as in this case, we often meet with data that have millions of observations, hundreds of variables, and inconsistent variable specification and coding.

“I hold a Bachelor of Commerce in Accounting, Finance and Information Systems. I have just completed Postgraduate Diploma in Science, majoring in Statistics, and in 2015, I will be doing Master of Science in Statistics.

“When I was studying information systems, my lecturer introduced several statistical techniques to us and I was fascinated by what statistics is capable of in the decision-making process. For example, retailers can find out if a customer is pregnant purely based on her purchasing behaviour, so the retailers can send out coupons to increase their sales. It is amazing how we can use statistical techniques to find that little tiny bit of useful information in oceans of data. Statistics appeals to me as it is highly useful and applicable in almost every industry.

“This summer, I will spend some time doing road trips – hopefully I can make it to the South Island this time. I enjoy doing road trips alone every summer as I feel this is the best way to get myself refreshed and motivated for the next year.”

January 21, 2015

How to feel good about New Zealand

StatsChat criticises the NZ media a lot, but if you really want a target-rich zone, the place is the UK. Today, the Daily Express had this front page:

[Image: Daily Express front page]

The biggest vote on this country’s ties to Brussels for 40 years saw 80 per cent say they no longer want to be in Europe, the Daily Express can reveal.

It marks a huge leap forward in this newspaper’s crusade to get Britain out of the EU.

This comes from a survey in three Conservative electorates in the southern UK (out of 650 electorates), where 100,000 questionnaires were distributed. About 12% said Britain should leave the EU, about 3% were opposed, and the other 85% didn’t respond.
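To see where the 80% comes from, and how little it pins down, here is a minimal sketch using the rounded figures above. The counts are approximations derived from "about 12%" and "about 3%" of 100,000, not published tallies.

```python
distributed = 100_000
leave = 12_000  # about 12% of questionnaires
stay = 3_000    # about 3% of questionnaires

responded = leave + stay
response_rate = responded / distributed
share_of_respondents = leave / responded

# Extreme cases for everyone who was sent a questionnaire:
# all non-respondents want to stay, or all of them want to leave.
lower_bound = leave / distributed
upper_bound = (leave + (distributed - responded)) / distributed

print(f"response rate: {response_rate:.0%}")
print(f"leave, among respondents: {share_of_respondents:.0%}")
print(f"leave, among all recipients: between {lower_bound:.0%} and {upper_bound:.0%}")
```

The headline figure is the share among the 15% who replied; without knowing anything about the other 85%, the data alone are consistent with anything from 12% to 97%.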

Other, better-conducted polling doesn’t find such a dramatic lead. Even a late-December poll by “Get Britain Out” found only 51% support for leaving the EU and consoled themselves by describing this as showing their campaign was gaining momentum.

(via @federicacocco)

January 20, 2015

Ask a silly question, get a silly answer

The monthly US FoodDemand survey added some questions about government policies this time around. Mostly these were reasonable (eg, do you support a tax on sugared sodas, which got 39% ‘Yes’, the same as here; do you support a ban on sale of marijuana, 46% yes).

However, one question was

“Do you support mandatory labeling for foods containing DNA?”

There’s no way this is a sensible question about government policies: it isn’t a reasonable policy or one that has been under public debate.  Most foods will contain DNA, the exceptions being distilled spirits, some candy, and (if you don’t measure too carefully) white rice and white flour. Nevertheless, 80% of people were in favour.

There was also a question “Do you support mandatory labeling for foods produced with genetic engineering”. This got 82% support.

It seems most likely that many respondents interpreted these questions as basically the same: they wanted labelling for food containing DNA that was added or modified by genetic engineering.  This isn’t what the researchers meant, since they write

A large majority (82%) support mandatory labels on GMOs, but curiously about the same amount (80%) also support mandatory labels on foods containing DNA.

If you ask a question that is nuts when interpreted precisely, but is basically similar to a sensible question, people are going to answer the question they think you meant to ask. People are helpful that way, even when it isn’t helpful.

January 6, 2015

Foreign drivers, again

The Herald has a poll saying 61% of New Zealanders want to make large subsets of foreign drivers sit written and practical tests before they can drive here (33.9%: people from right-hand drive countries; 27.4% everyone but Australians). It’s hard to tell how much of this is just the push effect of being asked the questions and how much is real opinion.

The rationale is that foreign drivers are dangerous:

Overseas drivers were found at fault in 75 per cent of 538 injury crashes in which they were involved. But although failure to adjust to local conditions was blamed for seven fatal crashes, that was the suspected cause of just 26 per cent of the injury crashes.

This could do with some comparisons.  75% of 538 is 403, which is about 4.5% of all injury crashes that year.  We get about 2.7 million visitors per year, with a mean stay of 20 days (PDF), so on average the population is about 3.3% short-term visitors.
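The arithmetic behind those comparisons, as a rough sketch. The resident population of 4.5 million is my assumption, chosen as a round figure for the period; the other inputs are the ones quoted above.

```python
# Injury crashes involving an overseas driver, and the share where they were at fault.
crashes_with_overseas_driver = 538
at_fault_share = 0.75
at_fault_crashes = at_fault_share * crashes_with_overseas_driver

# Average number of short-term visitors in the country on a given day.
visitors_per_year = 2_700_000
mean_stay_days = 20
visitors_present = visitors_per_year * mean_stay_days / 365

# ASSUMPTION: resident population of roughly 4.5 million.
resident_population = 4_500_000
visitor_share = visitors_present / resident_population

print(f"at-fault crashes: {at_fault_crashes:.0f}")
print(f"visitors present on an average day: {visitors_present:,.0f}")
print(f"visitors as a share of population: {visitor_share:.1%}")
```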

Or, we can look at the ‘factors involved’ for all the injury crashes. I get 15367  drivers of motorised vehicles involved in injury crashes, and 9192 of them have a contributing factor that is driver fault (causes 1xx to 4xx in the Crash Analysis System). This doesn’t include things like brake failures.  So, drivers on average are at fault in about 60% of the injury crashes they are involved in.
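The same comparison in code, using only the counts quoted above. It is a rough comparison: the all-driver figure is per driver and includes overseas drivers, while the 75% figure is per crash.

```python
# Drivers of motorised vehicles in injury crashes, and those with a driver-fault factor.
drivers_in_injury_crashes = 15367
drivers_at_fault = 9192
overall_fault_rate = drivers_at_fault / drivers_in_injury_crashes

# Fault rate reported for overseas drivers in the Herald story.
overseas_fault_rate = 0.75
ratio = overseas_fault_rate / overall_fault_rate

print(f"all drivers at fault: {overall_fault_rate:.0%}")
print(f"overseas relative to all drivers: {ratio:.2f}x")
```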

Based on this, it looks as though foreign drivers are somewhat more dangerous, but that restricting them is very unlikely to prevent more than, say, 1-2% of crashes. If you consider all the ways we might reduce injury crashes by 1-2%, and think about the side-effects of each one, I don’t think this is going to be near the top of the list.

January 2, 2015

Maybe not a representative sample

The Dominion Post asked motorists why they thought the road toll had climbed, and what should be done about it.

[Image: roadtoll]

Interestingly, three of the five (middle-aged, white, male, Wellington area) motorists attributed it to random variation. That’s actually possible: the evidence for a real change in risk nationally is pretty modest (and the Wellington region toll is down on last year).

(via @anderschri5 on Twitter)

December 29, 2014

What’s not in a name

I passed up this reprinted advertising-oriented survey story about “The naughtiest names” the first time it came around. It’s back.

The findings come from a survey that looked at the names of more than 63,000 school children who logged good behaviour or achievement awards in online sticker books.

Those with the most good behaviour awards were named Jacob and Amy, closely followed by Georgia and Daniel.

Coincidentally, I’ve been listening to the BBC production of Good Omens, by Terry Pratchett and Neil Gaiman. It’s available online for the next three weeks. People who like that sort of thing will find it’s the sort of thing they like. Early on, names are being suggested for a baby who turns out to be the Antichrist:

“Wormwood’s a nice name..Or Damien. Damien’s very popular….Or Cain. Very modern sound, Cain, really.”

This attempt to suggest ‘the naughtiest name’ failed dismally, and that’s probably true of the British survey as well.  The survey is probably a bit more representative of the population, but Good Omens is probably more realistic about the impact of names on the behaviour of children.

If you go to the original source, you see the originators of the survey didn’t really believe it either:

Neil Hodges, School Stickers Managing Director says, “The annual ‘Santa’s Naughty and Nice list’ is just a bit of fun, and obviously there are many Ella’s and Joseph’s that are perfect little angels, just as I’m sure there are many Amy’s and Jacobs that can be a bit of a handful.

though most of the mainstream media stories lost the disclaimer. This time it wasn’t the press release that was to blame.

It’s not that names have no effect. There’s a lot of research showing that identical job applications, for example, may be handled differently if different names are attached. There’s also a lot of social information in names — the story mentions research showing that you’re much more likely to get into Oxford or Cambridge if you’re called Eleanor than if you’re called Jade.

It’s possible there is some effect beyond social stratification and teacher prejudices, but this sort of survey is hopelessly unfit to reveal it.  That’s not the worst aspect, though. Even if the patterns of behaviour and name were real, they are soon going to be out of date. Patterns of first names change quite quickly, and this data presumably refers to kids who were named 5-10 years ago.  ‘Eleanor’ is now one of the names on the Naughty list.