Posts filed under General (618)

January 30, 2015

A bit more complicated than that

On Twitter, I got a link to a Telegraph story “One glass of wine increases stroke risk by third”, with the request “Debunk please.”

Not all depressing health news is necessarily wrong. However, even if it’s describing a real risk you can be pretty confident that it will have been exaggerated a bit, and that’s the case here.  It’s probably true that alcohol consumption at not-particularly-high levels increases stroke risk, but it does pay to look more closely at what the research is claiming.

The first thing to notice is that the story shifts from

Middle aged drinkers who down just one large glass of wine a day increase their risk of stroke by a third, warns a new study.

to

The results showed drinkers in their fifties and sixties who had at least two alcoholic drinks a day…

That is, the research lumped together everyone who averaged two or more standard drinks per day (actually, 2.4 standard drinks/day in NZ units). This group, collectively, had a 34% higher stroke risk than the 0.5 drink/day group, but the group who had 1-2 standard drinks per day were not at any increased risk.  Unless there’s a magic threshold at 2.4 drinks/day of alcohol, the excess risk must be less than 1/3 for people just into the 2.4+ drinks range, and more for people far into the range.

The next step is to try to look at the actual alcohol consumption in each group, to see how far above 2.4/day the highest group was. That’s not given, but some interesting things are. First, only 3% of the people who had strokes drank more than 2.4 drinks/day, so this wasn’t a very good sample for looking at heavy drinking. The researchers pointed this out themselves: “A potential limitation of our study could be a low proportion of heavy drinkers as alcohol consumption in Sweden is one of the lowest in Europe.”

What’s more surprising is that 3% of the people who didn’t have strokes also drank more than 2.4 drinks/day.  In fact, the mean and median alcohol consumption were slightly lower in the people who had strokes than in the people who didn’t.  How can this be?

Part of the explanation is given in the Telegraph story:

The findings show that blood pressure and diabetes appeared to take over as one of the main influences on having a stroke at around the age of 75.

That is, the alcohol effect was mostly in middle-aged people, where the actual stroke risk is lower. This was the main finding of the research, in fact. It still wouldn’t entirely explain the lack of difference in alcohol consumption, but there’s also probably a contribution from the statistical model they used and the way it handles people who die of something other than a stroke, and from lower risk in light drinkers than in non-drinkers.

The other question to ask, always, is what other research there is. Here’s a graph from a meta-analysis combining 27 alcohol and stroke studies published last year. It also shows an increase, but not as dramatic as in the new study.

Since the new study wasn’t particularly well suited to looking at the effect of heavier drinking (it would have been better for looking at non-drinkers vs light drinkers), there isn’t much case for preferring the single new study estimate over the combined estimate.  According to this estimate, at 2 drinks/day there isn’t convincing evidence of increased risk. Above 3 drinks/day there is, and it goes up rapidly after that. Another meta-analysis in 2010 found broadly similar results, as did one in 2003 in the prestigious journal JAMA.

So, not really debunked, but not quite as bad as it sounds.

Meet Statistics summer scholar Ying Zhang


Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Ying, right, is working on a project called Service overview, client profile and outcome evaluation for Lifeline Aotearoa Face-to-Face Counselling Services with the Department of Statistics’ Associate Professor David Scott and Christine Dong, research and clinical engagement manager at Lifeline and an Honorary Research Fellow in the Department of Psychological Medicine at the University of Auckland. Ying explains:

“Lifeline New Zealand is a leading provider of dedicated community helpline services, face-to-face counselling and suicide prevention education. The project aims to investigate the client profile, the clinical effectiveness of the service and client experiences of, and satisfaction with, the face-to-face counselling service.

“In this project, my work includes three aspects: Data entry of client profiles and counselling outcomes; qualitative analysis of open-ended questions and descriptive analysis; and modelling for the quantitative variables using SAS.

“Very few research studies have been done in New Zealand to explore client profiles or find out clients’ experiences of, and satisfaction with, community face-to-face counselling services. Therefore, the study will add evidence in terms of both clinical effectiveness and client satisfaction. This study will also provide a systematic summary of the demographics and clinical characteristics of people accessing such services. It will help provide direction for strategies to improve the quality and efficiency of the service.

“I have just graduated from the University of Auckland with a Postgraduate Diploma in Statistics. I got my bachelor’s and master’s degrees, majoring in information management and information systems, at Zhejiang University in China.

“My first contact with statistics was around 10 years ago when I was at university in China. It was an interesting but complex subject for me. After that, I did some internship work relating to data analysis. It helped me accumulate more experience about using data analysis to help inform business decisions.

“This summer, apart from participating in the project, I will spend some time expanding my knowledge of SAS – it’s a very useful tool and I want to know it better. I’m also hoping to find a full-time job in data analysis.”





January 29, 2015

30 years is longer than one week

From Stuff, on the housing affordability index

“The university said a key driver was the median house price, which rose more than $30,000 over the year, eclipsing the $19.35 increase in average weekly wages.

Interest rates also rose from 5.51 per cent to 5.97 per cent on average.”

Comparing the median house price increase to the median (I think) individual weekly wage and salary income increase is a particularly opaque way of presenting the data. Obviously $30,000 is a lot more than $19.35, but one is paid over thirty years and the other is received over one week.

For example, it should be easy to say what increase in average weekly earnings would be necessary to not be ‘eclipsed’ by the $30,000 house price increase. If the report doesn’t say, the journalist should ask. The reader shouldn’t have to do that calculation. It turns out that if median weekly wages had risen $34.50 instead of $19.35, they wouldn’t have been eclipsed and the affordability index would have stayed constant. This isn’t the impression that you’d get from the story.
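One rough way to put the two numbers on the same time scale is to convert the $30,000 into the weekly repayment it would add to a mortgage. The sketch below is only an illustration and is not the affordability index’s own formula, so it isn’t expected to reproduce the $34.50 figure. It assumes a standard amortising loan with monthly compounding, the 5.97 per cent rate quoted in the story, and a 30-year term (the term and the function name `weekly_repayment` are my assumptions).

```python
def weekly_repayment(principal, annual_rate, years):
    """Weekly-equivalent repayment on a standard amortising loan.

    Assumes monthly compounding and level monthly payments, then
    spreads the twelve monthly payments evenly over 52 weeks.
    """
    monthly_rate = annual_rate / 12
    n_payments = years * 12
    monthly = principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)
    return monthly * 12 / 52


# Figures from the story: a $30,000 price rise and the 5.97% average rate.
# The 30-year term is an assumption, not something the story states.
extra_per_week = weekly_repayment(30_000, 0.0597, 30)
print(f"Extra repayment: about ${extra_per_week:.2f} per week")
```

On these assumptions the extra house price works out to tens of dollars a week, which is the scale on which a weekly wage increase should be judged.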

The argument for an affordability index is that it makes affordability changes easier to understand by reducing them to a single number.  That’s only true either if you understand how the number is calculated (which takes quite a lot of research) or you don’t really care exactly what it means.


  • “When 2000 people take aspirin for one year, one heart attack is prevented.” A story on absolute risk and number-needed-to-treat, at the NY Times Upshot blog.  They introduce this as related to personalised medicine, but it’s really not.

Meet Statistics summer scholar Oliver Stevenson

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Oliver, right, is working on a project called Maps and graphics for animal populations with Associate Professor Rachel Fewster. Oliver explains:

“This project involves dealing with data from various conservation projects around the country. The data primarily consists of catch rates of various animal species at different locations of a project. My job is to come up with new ideas for maps, graphics and charts that conservation volunteers will find engaging, and that will illustrate the positive impact their work is having on New Zealand’s environment.

“The project is aimed at motivating the general public who are involved in local conservation schemes. When they return from a day’s work, they will get to see the rewards of their labours presented on a map, as well as personalised charts showing their own contribution to the project. Ideally, this keeps them motivated and coming back for more!

“I recently completed my Bachelor of Science majoring in Statistics and minoring in Psychology at the University of Otago. I am originally from Auckland, and have returned to pursue a Bachelor of Science (Honours) in Statistics in 2015.

“I enjoy statistics as I believe it can be applied to almost any aspect of life. Data exists in so many subjects and occupations: commerce, medicine, law, sports, the environment – anything you can think of!

“Where there is data, we can use statistics to gain a deeper understanding of the underlying processes taking place and better understand the world around us. Because statistics covers such a wide range of topics, I’m always working with something different, which keeps the subject interesting.

“This summer, hopefully I will find some time to get away and do some camping and get the chance to play a few games of cricket in the sun.”



January 28, 2015

Meet Statistics summer scholar Kai Huang

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Kai, right, is working on a project called Constrained Additive Ordination with Dr Thomas Yee. Kai explains:

“In the early 2000s, Dr Thomas Yee proposed a new technique in the field of ecology called Constrained Additive Ordination (CAO), which addresses problems concerning the shape of species’ response curves and how they are distributed along unknown underlying gradients. Meanwhile, the CAO-oriented Vector Generalised Linear and Additive Models (VGAM) package for R has been developed. This summer, I am compiling code to improve the performance of the VGAM package by facilitating the integration of R and C++ under the R environment.

“This project brings me the chance to work with a package in worldwide use and stimulates me to learn more about writing R extensions and C++ compilation. I don’t have any background in ecology, but I acquired a lot before I started this project.

“I have just completed the one-year Graduate Diploma in Science in Statistics at the University of Auckland after graduating from Massey University at Palmerston North with a Bachelor of Business Studies in Finance and Economics. In 2015, I’ll be doing an honours degree in Statistics. Statistics is used in every field, which is awesome to me.

“This summer, I’ll be spending my days rationally, working with numbers and code, and my nights romantically, spending my spare time with the stars. Seeing the movie Interstellar [a 2014 science-fiction epic that features a crew of astronauts who travel through a wormhole in search of a new home for humanity] reignited my curiosity about the universe, and I have been reading astronomy and physics books in my spare time this summer. I even bought an annual pass to Stardome, the planetarium in Auckland, and have spent several evenings there.”


January 27, 2015

Meet Statistics summer scholar Eric Lim

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Eric, right, is working on a project called Accessible graphics for data on maps with Professor Chris Wild. Eric explains:

“I am working on an easy-to-use data-analysis system called iNZight  that has been developed by Professor Chris Wild and his students at the University of Auckland. The primary purpose of iNZight is to allow students to experience exploring many different types of statistics, and it has been successfully deployed in many situations to produce significant results.

“My main task is to implement a simple geographical information system (GIS) in iNZight so that students can draw maps, visualise geographical information, and learn to interpret the patterns they reveal.

“Knowing where things happen is important, especially in looking for or displaying spatial relationships in areas such as crime, health, education, population, environmental resource management, market analysis, highway maintenance, accident monitoring, and emergency planning and routing.

“Geographical data are also very interesting and fun to look at, and I would like to present iNZight users with visually appealing and informative maps. A picture is worth a thousand words!

“I am from South Korea. I studied applied mathematics and statistics for my undergraduate degree, and recently finished my honours degree in statistics at the University of Auckland. I am hoping to study for a master’s in 2015.

“I am fascinated by patterns hidden inside data that can only be seen by using appropriate statistical methods. Learning different statistical techniques to effectively bring out the patterns is naturally my biggest interest and passion.

“I particularly love statistics because of its wide range of use in many areas such as finance, ecology, computing and many more.”





School fee/real-estate arithmetic

There’s an interesting piece in the Herald arguing that the effective school fees you pay by living in one of the top school zones in Auckland aren’t great value, and that you’d be better off just paying private-school fees explicitly. It’s a good point, but I think the calculations in the article are missing something:

Where a school zone boundary sliced through the middle of a suburban street, in-zone houses were up to $272,000 more expensive than comparable properties on the other side of the road.

“Over the life of a 20-year mortgage, at a fixed mortgage rate of 6.5 per cent, the extra $272,000 it costs to buy a home ‘in-zone’, with interest, equates to an outlay of $486,710. That’s almost half a million dollars.

That’s compared to private-school fees that could easily total “more than $100,000 per student over five years”.

There are two points that don’t get addressed explicitly in the article. Firstly, many people have more than one child. Secondly, the money spent on school-zone real estate isn’t gone; it’s a speculative investment.

Using their figures (because I’m lazy), if you subtract two kids at $100,000 school fees from the $486,710 real-estate plus interest you get $286,710. If the real-estate premium for the school zones keeps up with inflation, you basically break even, with the possibility of a big loss (if boundaries are redrawn) or a big gain (if prices keep going up).
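For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the Herald’s figure comes from a standard amortised mortgage with monthly repayments and monthly compounding (the article doesn’t say so explicitly, and the function name `total_repayments` is mine), and on that assumption it comes out close to their $486,710.

```python
def total_repayments(principal, annual_rate, years):
    """Total paid over the life of a standard amortising loan.

    Assumes level monthly payments and monthly compounding.
    """
    monthly_rate = annual_rate / 12
    n_payments = years * 12
    monthly = principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)
    return monthly * n_payments


premium = 272_000                # in-zone price premium quoted by the Herald
outlay = total_repayments(premium, 0.065, 20)

fees_two_children = 2 * 100_000  # two kids at $100,000 each, as in the post
remainder = outlay - fees_two_children

print(f"Outlay with interest: ${outlay:,.0f}")
print(f"After two sets of school fees: ${remainder:,.0f}")
```

The remainder is the amount to compare with the $272,000 premium you still own at the end, which is why the result depends on whether that premium holds its value.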

If you’re the sort of person the article is aimed at, there’s a good chance you’ve already got more of your money in Auckland real estate than is ideal, so speculating on the Grammar Zone premium might not be a good investment, but it’s not self-evidently bad.

There are a couple of surprising points about the article. First, you would hope this is the sort of calculation anyone would already be doing before planning to spend the thick end of a million bucks. Second, the fact that real-estate prices can go up as well as down is not something the Herald usually misses.

January 26, 2015

Meet Statistics summer scholar Rahul Singhal 

Every year, the Department of Statistics at the University of Auckland offers summer scholarships to a number of students so they can work with staff on real-world projects. Rahul Singhal, right, is working on a project called Developing Bias Weights for the New Zealand Longitudinal Census with Professor Alan Lee. Rahul explains:

“The project attempts to adjust for linkage bias in the New Zealand longitudinal census – to reduce this bias as much as possible.

“When we link people from one census to another, those people who have been linked may differ from those that could not be linked, that is, the non-linked people may have different characteristics from the linked people.

“The bias can result in a tendency to overestimate or underestimate important relationships between variables, such as the effect of a person’s occupation on mortality risk.  This tendency could potentially result in incorrect conclusions. Thus, this project could be very helpful for other projects that use the New Zealand Longitudinal Census to investigate the effect of different variables.

“I have just finished my conjoint BA/BCom degree in Statistics, Economics, Accounting and Finance.  Statistics has interested me ever since I took the Statistics 108 course, Statistics for Commerce, in which I learned about the power and flexibility of statistics. It is the main reason why I decided to go from a single degree to a conjoint degree.

“I don’t have too much planned for my summer break, just visiting family in India, as I haven’t seen them for a few years.”


January 23, 2015

Meet Statistics summer scholar Bo Liu

Every year, the Department of Statistics offers summer scholarships to a number of students so they can work with staff on real-world projects. Bo, right, is working on a project called Construction of life-course variables for the New Zealand Longitudinal Census (NZLC) with Roy Lay-Yee, Senior Research Fellow at the COMPASS Research Centre, University of Auckland, and Professor Alan Lee of Statistics. Bo explains:

“The New Zealand Longitudinal Census has linked individuals across the 1981-2006 New Zealand censuses. This enables the assessment of life-course resources with various outcomes.

“I need to create life-course variables such as socio-economic status, health, education, work, family ties and cultural identity from the censuses. Sometimes such information is not given directly in the census questions, and several pieces of information need to be combined.

“An example is the overcrowding index that measures the personal living space. We need to combine the age, partnership status of the residents and number of bedrooms in each dwelling to derive the index.

“Also, the format of the questionnaire as well as the answers used in each census were rather different, so data-cleaning is required. I need to harmonise information collected in each census so that they are consistent and can be compared over different censuses. For example, in one census the gender might be given code ‘0’ and ‘1’ representing female and male, but in another census the gender was given code ‘1’ and ‘2’. Thus the code ‘1’ can mean quite different things in different censuses. My job is to find these differences and gaps in each census.

“The results of this project will enable future studies based on New Zealand longitudinal censuses, say, for example, the influence of life-course variables on the risk of mortality. This project will also be a very good experience for my future career, since data-cleaning is a very important process that we were barely taught in our courses but will actually take up almost one-third of the time in most real-life projects. When we were studying statistics courses, most data sets we encountered were “toy” data sets that had fewer variables and observations and were clean. However, in real life, as in this case, we often meet with data that have millions of observations, hundreds of variables, and inconsistent variable specification and coding.

“I hold a Bachelor of Commerce in Accounting, Finance and Information Systems. I have just completed a Postgraduate Diploma in Science, majoring in Statistics, and in 2015, I will be doing a Master of Science in Statistics.

“When I was studying information systems, my lecturer introduced several statistical techniques to us and I was fascinated by what statistics is capable of in the decision-making process. For example, retailers can find out if a customer is pregnant purely based on her purchasing behaviour, so the retailers can send out coupons to increase their sales. It is amazing how we can use statistical techniques to find that little tiny bit of useful information in oceans of data. Statistics appeals to me as it is highly useful and applicable in almost every industry.

“This summer, I will spend some time doing road trips – hopefully I can make it to the South Island this time. I enjoy doing road trips alone every summer as I feel this is the best way to get myself refreshed and motivated for the next year.”
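Bo’s gender-coding example is easy to sketch in code. This is a hypothetical illustration only, not the NZLC’s actual code books: the 0/1 coding is from Bo’s example, but which of 1 and 2 means female in the second census is my assumption, as are the names `CODEBOOKS` and `harmonise_gender`.

```python
# Hypothetical code books. The 0/1 coding follows the example in the post;
# the meaning of 1 and 2 in the second census is assumed for illustration.
CODEBOOKS = {
    "census_a": {"0": "female", "1": "male"},
    "census_b": {"1": "female", "2": "male"},
}


def harmonise_gender(census, code):
    """Translate a census-specific gender code into a shared label."""
    try:
        return CODEBOOKS[census][str(code)]
    except KeyError:
        return None  # unknown census or code: flag for manual checking


# The same raw code "1" maps to different labels depending on the census.
records = [("census_a", 1), ("census_b", 1), ("census_b", 2), ("census_a", 9)]
harmonised = [harmonise_gender(census, code) for census, code in records]
print(harmonised)
```

Keeping the code books in one lookup table means every difference between censuses is written down in one place, and anything that doesn’t match comes back as a gap to investigate rather than being silently miscoded.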