Posts from January 2014 (43)

January 16, 2014

Private-sector surveillance

A maths-free article on data mining and surveillance, from the New York Review of Books

Using techniques ranging from supermarket loyalty cards to targeted advertising on Facebook, private companies systematically collect very personal information, from who you are, to what you do, to what you buy. Data about your online and offline behavior are combined, analyzed, and sold to marketers, corporations, governments, and even criminals. The scope of this collection, aggregation, and brokering of information is similar to, if not larger than, that of the NSA, yet it is almost entirely unregulated and many of the activities of data-mining and digital marketing firms are not publicly known at all.

January 15, 2014

He’s a forestry export statistic and he’s ok

From the Herald

Log exports to China are driving an increase in the value of New Zealand’s forestry products – which has more than doubled in the past 20 years, according to figures from Statistics New Zealand.

and from Stats NZ, this infographic (click to embiggen, as usual)

[Infographic: forestry exports, from Statistics New Zealand]

In both the infographic and the Herald story there’s interesting information about changes in the makeup of NZ forestry exports (more logs, more to China), but in both cases the headline number is a bit misleading.

The change is from $1.9 billion in 1992 dollars to $4.5 billion in 2012 dollars. It’s not explicit that the dollar values are nominal, but they are — if I were a newspaper I’d say “evidence obtained by StatsChat under the Official Information Act”, which is to say I asked the always-helpful @StatisticsNZ twitter account.

My general view is that comparing nominal dollars twenty years apart is Not Even Wrong, but I have to admit it’s not obvious what the ideal adjustment would be. Consumer Price Index? Producer Price Index? Something that involves exchange rates? For that reason, here’s a selection of adjustments:

  • CPI: $1.9 billion in 1992 is $2.97 billion now, using the Reserve Bank’s calculator, which all journalists should have bookmarked
  • PPI (outputs, all industries): $1.9 billion in 1992 is $3 billion now
  • PPI (inputs, all industries): $1.9 billion in 1992 is $3.1 billion now
  • As a proportion of GDP, forestry has fallen from 2.5% to 2.1%
  • As a proportion of all exports, forestry has fallen from 10.4% to 9.4%

So, there’s a reasonable degree of agreement between measures that forestry has increased in value about 50% and has fallen slightly as a fraction of the economy. “More than doubled” doesn’t seem defensible.
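The “about 50%” figure can be checked directly from the numbers in the list. This is only a rough sketch: it takes the adjusted 1992 values exactly as quoted above and, as the post does, treats “now” as comparable to 2012 dollars. The variable and function names are illustrative.

```python
# Nominal forestry export values quoted in the post, in billions of NZ dollars.
NOMINAL_1992 = 1.9
NOMINAL_2012 = 4.5

# The 1992 figure restated in current dollars under each adjustment listed above.
ADJUSTED_1992 = {
    "CPI": 2.97,
    "PPI outputs": 3.0,
    "PPI inputs": 3.1,
}


def real_growth(adjusted_base, current=NOMINAL_2012):
    """Fractional increase from the inflation-adjusted base to the current value."""
    return current / adjusted_base - 1


# Ratio of the two nominal figures, which is what the headline relies on.
nominal_ratio = NOMINAL_2012 / NOMINAL_1992
print(f"Nominal: {nominal_ratio:.2f} times the 1992 value")

# Growth after each adjustment; all three land near 50%.
for name, base in ADJUSTED_1992.items():
    print(f"{name}: {real_growth(base):.0%} increase")
```

The nominal ratio is above 2, which is where “more than doubled” comes from, while each adjusted comparison gives an increase of roughly 45% to 52%.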

(via @kiwieric)

Briefly

‘Approaches to measurement’ edition

Meet Geyang Mao, Statistics summer scholar

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with our staff on real-world projects. Geyang is working with Dr Ian Tuck on a project called Correlations between air pollutants.  Geyang, below, explains:

“It is likely that the levels of air pollutants are somehow related in time if they arise from a common source such as vehicle emissions. Relationships between levels of pollutants could also be influenced by meteorological conditions, such as wind speed. Furthermore, the relationship may change in time as the nature of emissions changes due to technological change or emission controls.

Geyang Mao

“My research is about investigating the nature of the correlation between a number of key air pollutants, using data collected by Auckland Council and National Institute of Water and Atmospheric Research.  This project mainly focuses on predicting PM2.5 from PM10 and investigating correlation between particle number concentration (PNC) and nitrogen oxides (NOx).

“PM2.5 refers to particles smaller than 2.5 micrometers, which tend to penetrate into the gas-exchange regions of the lung. They can adversely affect human health and also have impacts on climate and precipitation. It is becoming very important to measure PM2.5 level in Auckland Council’s air-quality monitoring network. However, most weather stations in New Zealand monitor just the level of PM10. It is too expensive to measure all pollutants at all times, therefore, this project will be very useful if we can establish the relationship between correlated pollutants so that concentrations of one may be estimated from concentrations of the other.

“I have just finished my third year of a Bachelor of Science/ Bachelor of Commerce conjoint majoring in Statistics, Accounting and Finance, and I plan to pursue postgraduate studies in Statistics after I graduate.

“Statistics appeals to me because of its relevance to a lot of real-world problems. Stats can be widely applied to almost every industry. It helps us to make sense of data and extract useful information from large datasets. Furthermore, Statistics provides me with many quantitative skills that are transferable across a wide range of fields and has improved my problem-solving ability significantly. I have found learning stats at the University of Auckland to be very enjoyable due to the great learning environment and friendly lecturers.

“Over the summer, I’m also doing a lot of relaxing: catching up with friends, watching movies, spending some time at the beach and playing computer games.”

 

Fancy packaging of plain packaging impact

The Sydney Morning Herald has a story on the impact of plain packaging for cigarettes in Australia.  Cancer researchers in Sydney found a big spike in calls to Quitline after the packaging change, and interpreted this as evidence it was working

The researchers said although the volume of calls to Quitline was an “indirect” measure of people’s quitting intentions and behaviour, it was more objective than community surveys where people can answer questions in a socially desirable and biased way.

On the other side, tobacco companies say there hasn’t been any actual fall in smoking.

“In November 2013, a study by London Economics found that since the introduction of plain packaging in Australia there has been no change in smoking prevalence … What matters is whether fewer people are smoking as a result of these policies – and the data is clear that overall tobacco consumption and smoking prevalence has not gone down,” he said.

In this setting you might reasonably be concerned that either side is putting their results in fancy packaging. So what should you believe?

In fact, the claims are consistent with each other and don’t say much either way about the success of the program.  If you look at the research paper, they found an increase peaking at about 300 calls per week and then falling off by about 14% per week. That works out to be a total of roughly 2000 extra calls attributed to the packaging change, ie, just over half a percent of all smokers in Australia, or perhaps a 10% increase in the annual Quitline volume. If the number of people actively trying to quit by methods other than Quitline also goes up by 10%, you still wouldn’t expect to see much impact on total tobacco sales after one year.
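The “roughly 2000 extra calls” figure follows from summing a geometric decline. The sketch below assumes the excess starts at the peak of 300 calls in the first week and shrinks by exactly 14% each week after that; the research paper’s fitted model may differ in detail, and the function name is illustrative.

```python
PEAK_CALLS_PER_WEEK = 300   # excess calls in the peak week, from the post
WEEKLY_DECAY = 0.14         # fractional fall-off per week, from the post


def total_extra_calls(peak, decay, weeks=None):
    """Sum of peak * (1 - decay)**k for k = 0, 1, ..., weeks - 1.

    With weeks=None the series is summed to infinity, which is peak / decay.
    """
    if weeks is None:
        return peak / decay
    retain = 1 - decay
    return peak * (1 - retain ** weeks) / decay


long_run = total_extra_calls(PEAK_CALLS_PER_WEEK, WEEKLY_DECAY)
one_year = total_extra_calls(PEAK_CALLS_PER_WEEK, WEEKLY_DECAY, weeks=52)
print(f"Long-run total: {long_run:.0f} extra calls")
print(f"First 52 weeks: {one_year:.0f} extra calls")
```

Both totals come out a little over 2100, because at a 14% weekly decline almost all of the excess arrives within the first few months.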

The main selling point for the plain packaging (eg) was that it would prevent young people from starting to smoke. That’s what really needs to be evaluated, and it’s probably too early to tell.

 

[Update: Of course, other countries that were independently considering changing their policies shouldn’t wait for years just because Australia started first. That would be silly.]

[Update: the Quitline data are just for NSW; so perhaps 1.5% of smokers]

January 14, 2014

Causation, counterfactuals, and Lotto

A story in the Herald illustrates a subtle technical and philosophical point about causation. One of Saturday’s Lotto winners says

“I realised I was starving, so stopped to grab a bacon and egg sandwich.

“When I saw they had a Lotto kiosk, I decided to buy our Lotto tickets while I was there.

“We usually buy our tickets at the supermarket, so I’m glad I followed my gut on this one,” said one of the couple, who wish to remain anonymous.

Assuming it was a random pick, it’s almost certainly true that if they had not bought the ticket at that Lotto kiosk at that time, they would not have won.  On the other hand, if Lotto is honest, buying at that kiosk wasn’t a good strategy — it had no impact on the chance of winning.

There is a sense in which buying the bacon-and-egg sandwich was a cause of the win, but it’s not a very useful sense of the word ‘cause’ for most statistical purposes.

Meet Bor-Kuan Song, Statistics summer scholar

Every year, the Department of Statistics at the University of Auckland offers summer scholarships to a number of students so they can work with our staff on real-world projects. Bor-Kuan is working with Dr Steffen Klaere on a project called Comparing and visualising measures of biodiversity. Bor-Kuan (below) explains:

“Our project is about acquiring and modifying data on biodiversity in New Zealand and possibly combining them into a visualisation in the form of maps. We will try look into the interaction between and within New Zealand birds, plants and soil compositions.

Bor-Kuan Song

“As technology develops and the availability of web storage space increases, it becomes easier to compare data from different sources. This presents an opportunity to assess the biodiversity in the NZ ecosystem, and have a better understanding of it.

“I’m studying a Bachelor of Science majoring in Statistics and Computer Science, and I’ve already finished a Bachelor of Music majoring in accordion performance. I’m into maths as well and have done most of the Stage II maths courses. My goal is to do actuarial studies, and so statistics, and, in part, computer science, will be a big part of it.”

The dangers of better measurement

An NPR News story on back pain and its treatment

One reason invasive treatments for back pain have been rising in recent years, Deyo says, is the ready availability of MRI scans. These detailed, color-coded pictures that can show a cross-section of the spine are a technological tour de force. But they can be dangerously misleading.

This MRI shows a mildly herniated disc. That’s the sort of thing that looks abnormal on a scan but may not be causing pain and isn’t helped by surgery.

“Seeing is believing,” Deyo says. “And gosh! We can actually see degenerated discs, we can see bulging discs. We can see all kinds of things that are alarming.”

That is, they look alarming. But they’re most likely not the cause of the pain.

Health food research marketing

The Herald has a story about better ways to present nutritional information on foods

“Our study found that those who were presented with the walking label were most likely to make healthier consumption choices, regardless of their level of preventive health behaviour,” Ms Bouton said.

“Therefore, consumers who reported to be unhealthier were likely to modify their current negative behaviour and exercise, select a healthier alternative or avoid the unhealthy product entirely when told they would need to briskly walk for one hour and 41 minutes to burn off the product.

“The traffic light system was found to be effective in deterring consumers from unhealthy foods, while also encouraging them to consume healthy products.”

This sounds good. And this is a randomised experiment, which is an excellent feature.

However, it’s just an online survey of 591 people, about a hypothetical product, so what it actually found was that the labelling system was effective in deterring people from saying they would buy unhealthy foods, encouraging them to say they would consume healthy products, and making them more likely to say they would exercise. That’s not quite so good. It’s a lot easier to get people to say they are going to eat better, exercise more, and lose weight than to get them to actually do it.

Another interesting feature is that this new research has appeared on the Herald website before. In October 2012 there was a story based on the first 220 survey responses

Not only were people more likely to exercise when they saw such labels, they also felt more guilty, Ms Bouton said.

“My findings showed that the exercise labelling was significantly more effective in both chocolate and healthier muesli bars in encouraging consumers to exercise after consumption.

“It increased the likelihood of having higher feelings of guilt after consumption and was more likely to stop [the participant] consuming the chocolate bar with the exercise labelling.”

The 2012 story also didn’t raise the issue of what people said versus actual behaviour, but it did get an independent opinion, which pointed out that calories aren’t the only purpose of food labelling.

More importantly, the stories and the two press releases are all the information I could find online about the research. There don’t seem to be any more details either published or in an online report. It’s good to have stories about scientific research, and this sort of experiment is an important step in thinking about food labelling, but the stories are presenting stronger conclusions than can really be supported by a single unpublished online survey.

January 13, 2014

Meet Savannah Post, Statistics summer scholar

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with our staff on real-world projects. We’ll be profiling them on Stats Chat.

Savannah is working with Professor Alan Lee on a project called Modelling criminal sentencing. Savannah explains:

Savannah Post

“The aim of my project is to identify the factors that are influencing sentencing outcomes in our justice system. Sentencing outcomes include both the type of punishment which is handed down – anything from a discharge without conviction to imprisonment – and also the length of the sentence imposed.

“In an ideal situation, the variation we see in sentencing outcomes should be explained purely by the crimes committed and the criminal history of the convicted person. Problems obviously arise when sentencing outcomes are influenced by illegitimate factors such as race. If race (or any other irrelevant factor) remains an influential factor in sentencing, even after the offending and criminal history have been taken into account, then certain groups in our society are being treated unfairly by the justice system, which is a major concern. The purpose of my research is to identify whether or not this is the case.

“This is useful research as the powers of the justice system are immense and criminal sentencing changes lives: not only the life of the person who is convicted but also the lives of that person’s family, friends and wider community. It’s absolutely essential that those powers are applied in the fairest way – without discrimination, in other words.

“I’ve just finished my third year of a BSc/LLB conjoint degree, so I have another two years to go. This is my second summer research project and I’m really enjoying the opportunity to experience another side of statistics.

“At the moment, I don’t have any clear ideas about where I’d like to go after finishing university, but I’m interested in the developing trend towards evidence-based policies in local and national government. I’m hoping to find an internship or something similar so that I can find out some more about the opportunities available.

“On a personal/academic level, the methodical and logical nature of statistics appeals to me. I also enjoy the combination of mathematical and interpretive skills that statistics requires. On another level, I think statistics is really important for society as a whole, because it can show us truths which would otherwise be obscured, either because the data is too overwhelming for us to comprehend or because our own prejudices have been influencing our perception of that data.

“I had a busy start to the summer, travelling to North America on a choir tour. It was loads of fun (but cold!), particularly since I had never been there before. Aside from my research, my plan for the rest of the summer is to relax and rejuvenate prior to hitting the books again in March.”