Posts filed under Random variation (51)

May 9, 2013

Counting signatures

A comment on the previous post about the asset-sales petition asked how the counting was done: the press release says

Upon receiving the petition the Office of the Clerk undertook a counting and sampling process. Once the signatures had been counted, a sample of signatures was taken using a methodology provided by the Government Statistician.

It’s a good question and I’d already thought of writing about it, so the commenter is getting a temporary reprieve from banishment for not providing a full name.  I don’t know for certain, and the details don’t seem to have been published, which is a pity — they would be interesting and educationally useful, and there doesn’t seem to be any need for confidentiality.

While I can’t be certain, I think it’s very likely that the Government Statistician provided the estimation methodology from Statistics New Zealand Working Paper No 10-04, which reviews and extends earlier research on petition counting.

There are several issues that need to be considered:

  • removing signatures that don’t come with the required information
  • estimating the number of eligible vs ineligible signatures
  • estimating the number of duplicates
  • estimating the margin of error in the estimate
  • deciding what level of uncertainty is acceptable

The signatures without the required information are removed completely; that’s not based on sampling.  Estimating eligible vs ineligible signatures is fairly easy by checking a sufficiently-large random sample — in fact, they use a systematic sample, taking names at regular intervals through the petition list, which tends to give more precise results and to be more auditable.  

Estimating unique signatures is tricky, because if you halve your sample size, you expect to see 1/4 as many duplicates, 1/8 as many triplicates, and so on. The key part of the working paper shows how to scale up the sample data on eligible, ineligible, and duplicate, triplicate, etc, signatures to get the unique unbiased estimator of the number of valid signatures and its variance.
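To see why duplicates scale so awkwardly, here is a minimal simulation sketch in Python. It assumes a simple random sample rather than the systematic sample described above, and the petition size and number of duplicates are invented for illustration. `count_sampled_duplicates` is a hypothetical helper, not anything taken from the working paper.

```python
import random


def count_sampled_duplicates(n_signers, n_duplicate_pairs, fraction, seed=0):
    """Count duplicate pairs that show up in a random sample of signatures.

    Hypothetical setup: n_signers people each sign once, and the first
    n_duplicate_pairs of them sign a second time.
    """
    rng = random.Random(seed)
    # One entry per signature; the value identifies the signer.
    signatures = list(range(n_signers)) + list(range(n_duplicate_pairs))
    sample_size = round(fraction * len(signatures))
    sample = rng.sample(signatures, sample_size)  # without replacement

    seen = set()
    pairs = 0
    for signer in sample:
        if signer in seen:
            pairs += 1  # both signatures of this person were sampled
        seen.add(signer)
    return pairs


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.25):
        found = count_sampled_duplicates(100_000, 5_000, fraction)
        # Naive scale-up: a pair is sampled with probability about fraction**2.
        print(fraction, found, round(found / fraction**2))
```

Dividing the sampled count by the square of the sampling fraction gives a naive scale-up for duplicate pairs. The working paper's estimator also has to handle triplicates, ineligible signatures, and the variance, none of which this sketch attempts.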

Once the level of uncertainty is specified, the formulas tell you what sample size to verify and what to do with the results.  I don’t know how the sample size is chosen, but it wouldn’t take a very large sample to get the uncertainty down to a few thousand, which would be good enough.   In fact, since the methodology is public and the parties have access to the electoral roll in electronic form, it’s a bit surprising that the petition organisers didn’t run a quick check themselves before submitting it.
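As a rough check on the "few thousand" claim, here is a back-of-envelope sketch. It ignores duplicates entirely (the hard part), treats the sample as a simple random sample, and uses made-up figures for the petition size and the valid fraction, so it only indicates the order of magnitude. `margin_of_error` is a hypothetical helper.

```python
import math


def margin_of_error(total_signatures, sample_size, valid_fraction, z=1.96):
    """Approximate 95% margin of error for the estimated number of valid
    signatures, using the binomial standard error with a finite-population
    correction. Duplicates are ignored."""
    fpc = 1 - sample_size / total_signatures
    se_fraction = math.sqrt(valid_fraction * (1 - valid_fraction) / sample_size * fpc)
    return z * total_signatures * se_fraction


if __name__ == "__main__":
    # Hypothetical petition: 400,000 signatures, 90% of them valid.
    for n in (2_500, 10_000, 40_000):
        print(n, round(margin_of_error(400_000, n, 0.9)))
```

Under these assumptions, checking about ten thousand signatures already brings the margin of error down to a few thousand.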

 

 

May 8, 2013

Does emergency hospital choice matter?

The Herald has a completely over-the-top presentation of what might be an important issue. The headline is “Hospital choice key to kids’ survival”, and the story starts off

Where ambulances take badly injured children first seems to affect their chances, paediatric surgeons say.

Starship children’s hospital surgeons have found that sending badly injured children to the wrong hospital may be contributing to a child death rate from injuries that is twice the rate of Australia’s.

The data:

Six (7 per cent) of the 88 children who went first to Middlemore died, but so did one (8 per cent) of the 12 who went directly to Starship.

That is, to the extent the data tell us anything, the evidence is against the headline.  Of course, the uncertainties are huge: a 95% confidence interval for the relative odds of dying after being sent to Middlemore goes from a 40-fold decrease to a 12-fold increase.  There’s basically no information in the survival data.
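For anyone who wants to see how wide the interval is, here is a minimal sketch using the textbook Wald (log odds ratio) approximation on the counts quoted above. This is an assumption about method, not a reconstruction of the calculation behind the figures in this post: with only one death in the Starship group the Wald approximation is crude, and it gives a narrower interval (roughly an eleven-fold decrease to a seven-fold increase) than the one quoted, which presumably came from a method better suited to small counts. The conclusion is the same either way.

```python
import math


def odds_ratio_wald_ci(deaths_a, survivors_a, deaths_b, survivors_b, z=1.96):
    """Odds ratio for group A versus group B with an approximate 95% Wald
    confidence interval computed on the log scale."""
    odds_ratio = (deaths_a * survivors_b) / (survivors_a * deaths_b)
    se_log = math.sqrt(1 / deaths_a + 1 / survivors_a + 1 / deaths_b + 1 / survivors_b)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Middlemore: 6 deaths out of 88; Starship: 1 death out of 12.
    estimate, lower, upper = odds_ratio_wald_ci(6, 82, 1, 11)
    print(f"odds ratio {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The interval comfortably includes 1 (no difference), as well as large effects in both directions.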

So, how much of the two-fold higher rate of death in NZ compared to Australia could reasonably be explained by suboptimal hospital choice? One of the surgeons involved in the study says

… overseas research showed that a good trauma protocol system could cut the death rate for injured adults by 20 to 30 per cent, but there was no good data for children.

That is, hardly any of the difference between NZ and Australia — especially as this specific hospital-choice issue only applies to one sector of one city in New Zealand, with less than 10% of the national population.

On the other hand, we see

The head of Starship’s emergency department, Dr Mike Shepherd, said the major factors contributing to New Zealand’s high fatal injury rate for children lay outside the hospital system in policies such as driver blood-alcohol limits, graduated driver licensing, and laws requiring children’s booster seats and swimming pool fences.

That sounds plausible, but if it’s the whole story you would expect high levels of non-fatal as well as fatal injuries. The overall rate of hospitalisations for injuries in children 0-14 years is almost identical in NZ (1395 per 100 000 per year, p29) and Australia (‘about’ 1500 per 100 000 per year, page v).

 

May 6, 2013

Chocolate bait and switch

Headline:  Study: Dark chocolate calms you down

Lead:

Eating dark chocolate can calm you down according to a new study.

Number of people actually given dark chocolate in the study: 0

Some surprising things

  • From Felix Salmon: US population is increasing, and people are moving to the cities, so why is (sufficiently fine-scale) population density going down? Because rich people take up more space and fight for stricter zoning.  You’ve heard of NIMBYs, but perhaps not of BANANAs
  • From the New York Times.  One of the big credit-rating companies is no longer using debts referred for collection as an indicator, as long as they end up paid.  This isn’t a new spark of moral feeling, it’s just for better prediction.
  • And from Felix Salmon again: Firstly, Americans are bad at statistics. When it comes to breast cancer, they massively overestimate the probability that early diagnosis and treatment will lead to a cure, while they also massively underestimate the probability that an undetected cancer will turn out to be harmless.

May 3, 2013

Screening

Stuff has a story (borrowed from the West Island) headlined “Over 40? Five tests you need right now”.

You might have expected some reference to the other recent news about screening: that the US Preventive Services Taskforce has now joined the Centers for Disease Control in recommending universal screening for HIV (as TVNZ reported).  It’s not clear if New Zealand will follow the trend — HIV infection in people not in high risk groups is less common here than in the US, so the benefit compared to more selective screening is smaller here.   This illustrates the complexities of population screening.  Not only does the test have to be accurate, especially in terms of its false positive rate, but there needs to be something useful you can do about a positive result, and screening everyone has to be better than just screening selected people.
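The role of prevalence and the false positive rate can be made concrete with a small sketch. The sensitivity, specificity, and prevalence figures below are purely hypothetical, chosen to show the shape of the problem rather than to describe any real HIV test or population.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person with a positive screening result
    actually has the condition (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical test: 99% sensitivity, 99.5% specificity.
    for prevalence in (0.01, 0.0005):
        ppv = positive_predictive_value(prevalence, 0.99, 0.995)
        print(f"prevalence {prevalence:.2%}: PPV {ppv:.1%}")
```

With the same test, a lower prevalence means a much larger share of positive results are false alarms, which is why universal screening can make sense in one country and not in another.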

So, let’s compare the suggestions from Stuff’s story to what national and international expert guidelines say you need.

Two of the tests, for high blood pressure and high cholesterol, are spot on. These are part of the national 2012/13 Health Targets for DHBs, with the goal being 75% of the eligible population having the tests within a five-year period.  The Health Target also includes blood glucose measurement to diagnose diabetes, which the story doesn’t mention.  The US Preventive Services Taskforce also recommends blood pressure and cholesterol tests, though it recommends universal diabetes screening only after age 50 or in people with high blood pressure or people with risk factors for diabetes.

One of the tests recommended in the story is a depression/anxiety questionnaire, for diagnosing suicide risk.  Just a couple of weeks ago, the US Preventive Services Taskforce issued guidelines on universal screening for suicide prevention by GPs, saying that there wasn’t good enough evidence to recommend either for or against. As the coverage from Reuters explains, these questionnaires do probably identify people at higher risk of suicide, but it wasn’t clear how much benefit came from identifying them. So, that’s not an unreasonable test to recommend, but it would have been better to indicate that it was controversial.

One more of the tests isn’t a screening test at all — the story recommends that you make sure you know what a standard drink of alcohol is.

The top recommendation in the story, though, is coronary calcium screening.  The US Preventive Services Taskforce recommends against coronary calcium screening for people at low risk of heart disease and says there isn’t enough evidence to recommend either for or against it in people at higher risk for other reasons. The American Heart Association also recommends against routine coronary calcium screening (they say it might be useful as a tiebreaker in people known to be at intermediate risk of heart disease based on other factors).  No-one doubts that calcium in the walls of your coronary arteries is predictive of heart disease, but people with high levels of coronary calcium tend to also be overweight or smokers, or have high blood pressure or high cholesterol or diabetes, and if they don’t have these other risk factors it’s not clear that anything can be done to help them.

As an afterthought on the coronary calcium screening point, the story has an additional quote from a doctor recommending coronary angiography before starting a serious exercise program.  I’d never heard of coronary angiography as a general screening recommendation — it’s a bit more invasive and higher-risk than most population screening. It turns out that the American College of Cardiology and the American Heart Association are similarly unenthusiastic, with their guidelines on use specifically recommending against angiography for screening of people without symptoms of coronary artery disease.

 

 

April 12, 2013

Random numbers from a radioactive source

Here’s a fun link which talks about the difference between truly random numbers and pseudo-random numbers. When we teach this, we often mention generation of random numbers (or at least the random number seed) from a radioactive source as one way of getting truly random numbers. Here is someone actually doing it. The sequel is well worth a watch too if you have the time.

April 11, 2013

Power failure threatens neuroscience

A new research paper with the cheeky title “Power failure: why small sample size undermines the reliability of neuroscience” has come out in a neuroscience journal. The basic idea isn’t novel, but it’s one of those statistical points that makes your life more difficult (if more productive) when you understand it.  Small research studies, as everyone knows, are less likely to detect differences between groups.  What is less widely appreciated is that even when a small study does see a difference between groups, that difference is less likely to be real than one seen in a large study.

The ‘power’ of a statistical test is the probability that you will detect a difference if there really is a difference of the size you are looking for.  If the power is 90%, say, then you are pretty sure to see a difference if there is one, and based on standard statistical techniques, pretty sure not to see a difference if there isn’t one. Either way, the results are informative.

Often you can’t afford to do a study with 90% power given the current funding system. If you do a study with low power, and the difference you are looking for really is there, you still have to be pretty lucky to see it: the data have to, by chance, be more favorable to your hypothesis than they should be.   But if you’re relying on the data being more favorable to your hypothesis than they should be, you can see a difference even if there isn’t one there.
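Here is a minimal sketch of the arithmetic behind that argument. It assumes the usual 5% false positive rate and a hypothetical prior (one in ten of the hypotheses being tested is actually true); both numbers are illustrative, and `prob_finding_is_real` is a made-up helper name.

```python
def prob_finding_is_real(power, prior, alpha=0.05):
    """Probability that a statistically significant result reflects a real
    difference, given the study's power, the prior probability that the
    difference exists, and the false positive rate alpha."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical prior: 10% of tested hypotheses are true.
    for power in (0.9, 0.5, 0.2):
        chance = prob_finding_is_real(power, prior=0.1)
        print(f"power {power:.0%}: {chance:.0%} of significant findings are real")
```

Under these assumptions, dropping the power from 90% to 20% takes a significant finding from probably real to probably not.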

Combine this with publication bias: if you find what you are looking for, you get enthusiastic and send it off to high-impact research journals.  If you don’t see anything, you won’t be as enthusiastic, and the results might well not be published.  After all, who is going to want to look at a study that couldn’t have found anything, and didn’t?  The result is that we get lots of exciting neuroscience news, often with very pretty pictures, that isn’t true.

The same is true for nutrition: I have a student doing an Honours project looking at replicability (in a large survey database) of the sort of nutrition and health stories that make it to the local papers. So far, as you’d expect, the associations are a lot weaker when you look in a separate data set.

Clinical trials went through this problem a while ago, and while they often have lower power than one would ideally like, there’s at least no way you’re going to run a clinical trial in the modern world without explicitly working out the power.

Other people’s reactions

April 6, 2013

Gun deaths visualisation

Periscopic, a “socially conscious data visualization firm” has produced an interactive display of the years of life lost due to gun violence in the US, based on national life expectancy data. Each victim appears as a dot moving along the arc of their life, and then dropping at the age of death. More and more accumulate as you watch.


 

Of course, it’s important to remember that this display gets a lot of its power from two facts: the USA is very big, and we know the names and ages at death of gun victims.  You couldn’t do the same thing as dramatically for smoking deaths, and it would look much less impressive in a small country.

 

Also, Alberto Cairo has a nice post using this as an example to talk about the display of uncertainty.

(via @hildabast)

 

April 1, 2013

Briefly

Despite the date, this is not in any way an April Fools post

  • “Data is not killing creativity, it’s just changing how we tell stories”, from Techcrunch
  • Turning free-form text into journalism: Jacob Harris writes about an investigation into food recalls (nested HTML tables are not an open data format either)
  • Green labels look healthier than red labels, from the Washington Post. When I see this sort of research I imagine the marketing experts thinking “how cute, they figured that one out after only four years”
  • Frances Woolley debunks the recent stories about how Facebook likes reveal your sexual orientation (with comments from me).  It’s amazing how little you get from the quoted 88% accuracy, even if you pretend the input data are meaningful.  There are some measures of accuracy that you shouldn’t be allowed to use in press releases.

March 30, 2013

Confirmation bias

From the Waikato Times, two quotes from a story on emergency services

He would not comment on what motivated the fracas or whether it was gang-related.

“We’re not jumping to conclusions.”

and

Though science dismisses any link between human behaviour and the moon, it’s cold comfort for hospitality staff and emergency workers who say the amount of trouble often spikes when the moon is at its brightest.

Ms Gill said staff reported that “the full moon often has an impact on the nature of presentations through ED”.

Science doesn’t dismiss a link.  There’s nothing unscientific about the idea of a link. It’s just that people have looked carefully and it’s not true.

(via @petrajane)