Explore the budget
Keith Ng’s budget visualisation now has today’s newly-released Government budget.
(update) There’s also one at Stuff, by Harkanwal Singh (note that it uses nominal, not inflation-adjusted amounts)
There’s currently discussion in NZ about whether to change the 5-yearly census. North America is providing some examples of what not to do.
Canada decided a while back that they were going to chop most of the questions off the census and put them in a new survey. The new survey is still sent to everyone, but is voluntary — the worst of both worlds, since a much smaller survey would allow for more effort per respondent in follow-up. Frances Woolley compares the race/ethnicity data from the 2006 Census and the new survey: the survey is dramatically overcounting minorities.
In the USA, a Republican congressman has proposed a bill that would stop the Department of Commerce and the Census Bureau from collecting basically anything other than the census. That would wipe out the American Community Survey, the detailed 1%/year sample that provides a wide range of regional data. It would also wipe out the Current Population Survey, used to estimate the unemployment rate. Fortunately for the US economy, there’s no chance of this bill becoming law: the business community hates it, and the Senate will never pass it. It’s still worrying that there’s a public-opinion advantage in pretending you want to abolish the government’s economic data collection.
A comment on the previous post about the asset-sales petition asked how the counting was done: the press release says
Upon receiving the petition the Office of the Clerk undertook a counting and sampling process. Once the signatures had been counted, a sample of signatures was taken using a methodology provided by the Government Statistician.
It’s a good question and I’d already thought of writing about it, so the commenter is getting a temporary reprieve from banishment for not providing a full name. I don’t know for certain, and the details don’t seem to have been published, which is a pity — they would be interesting and educationally useful, and there doesn’t seem to be any need for confidentiality.
While I can’t be certain, I think it’s very likely that the Government Statistician provided the estimation methodology from Statistics New Zealand Working Paper No 10-04, which reviews and extends earlier research on petition counting.
There are several issues that need to be considered:
The signatures without the required information are removed completely; that’s not based on sampling. Estimating eligible vs ineligible signatures is fairly easy by checking a sufficiently-large random sample — in fact, they use a systematic sample, taking names at regular intervals through the petition list, which tends to give more precise results and to be more auditable.
Estimating unique signatures is tricky, because if you halve your sample size, you expect to see 1/4 as many duplicates, 1/8 as many triplicates, and so on: every copy of a repeated signature has to land in the sample for the repeat to be noticed. The key part of the working paper shows how to scale up the sample data on eligible, ineligible, duplicate, triplicate, etc., signatures to get the unique unbiased estimator of the number of valid signatures and its variance.
Once the level of uncertainty is specified, the formulas tell you what sample size to verify and what to do with the results. I don’t know how the sample size is chosen, but it wouldn’t take a very large sample to get the uncertainty down to a few thousand, which would be good enough. In fact, since the methodology is public and the parties have access to the electoral roll in electronic form, it’s a bit surprising that the petition organisers didn’t run a quick check themselves before submitting it.
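To make the scaling argument concrete, here is a minimal sketch in Python. It is not the estimator from the working paper: it assumes a simple random sample (the real process uses a systematic sample), it assumes nobody signed more than twice, and the function names and the numbers in the usage note are made up for illustration.

```python
def pair_inclusion_probability(total, sample_size):
    """Chance that BOTH copies of a duplicated signature fall in a simple
    random sample of `sample_size` drawn without replacement from `total`."""
    return sample_size * (sample_size - 1) / (total * (total - 1))


def estimate_unique_valid(total, sample_size, valid_in_sample,
                          duplicate_pairs_in_sample):
    """Estimate the number of distinct eligible signers.

    Assumptions (illustrative only): simple random sampling, and no one
    signed more than twice. `valid_in_sample` counts every eligible sampled
    signature, including both copies of any duplicate pair found.
    """
    # Eligible signatures scale up by 1/f, where f = sample_size / total.
    est_valid = valid_in_sample * total / sample_size
    # Duplicate pairs scale up by roughly 1/f^2, because a pair is only
    # detected when both of its copies are sampled.
    est_pairs = duplicate_pairs_in_sample / pair_inclusion_probability(
        total, sample_size)
    # Each duplicate pair represents one signer counted twice.
    return est_valid - est_pairs
```

With hypothetical inputs, `estimate_unique_valid(1000, 100, 90, 1)` scales the 90 eligible sampled signatures up to 900, but scales the single observed duplicate up to about 101 pairs, giving roughly 799 unique signers. Halving the sample, as in `pair_inclusion_probability(1000, 500)`, gives a probability very close to 1/4.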
As you know, the petition for a referendum over asset sales has not reached its goal yet, due to lots of invalid signatures. This is not a new problem — the petition over the anti-smacking law initially had 17% invalid signatures and also fell short of its threshold on the first round — but it does seem to be worse than usual.
3News displayed this graph of the shortfall
It seemed to me that the 16,500 bar was a bit wider than I’d expect, so I checked on the video from the website. On my screen capture, which I think is what you get if you click on the image, the black bar has 872 signatures per pixel, the blue bar has 1018 signatures per pixel, the whole red bar has 535 signatures per pixel, and the 16,500 shortfall has 232 signatures per pixel. That is, the vertical scale for the shortfall is about four times that for the valid signatures.
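The comparison is just a ratio of signatures-per-pixel figures. A small Python sketch using the numbers measured above (the dictionary keys and variable names are mine, chosen for illustration):

```python
# Signatures represented by one pixel of bar height, as measured above.
signatures_per_pixel = {
    "black": 872,
    "blue": 1018,
    "red (whole bar)": 535,
    "shortfall": 232,
}

shortfall_scale = signatures_per_pixel["shortfall"]

# How many times more screen height each signature gets in the shortfall
# segment than in each of the other bars.
magnification = {
    bar: per_pixel / shortfall_scale
    for bar, per_pixel in signatures_per_pixel.items()
    if bar != "shortfall"
}

# Height of the 16,500 shortfall as drawn, and as it would be if it were
# drawn at the blue bar's scale.
shortfall = 16_500
height_as_drawn = shortfall / shortfall_scale
height_at_blue_scale = shortfall / signatures_per_pixel["blue"]

for bar, ratio in magnification.items():
    print(f"{bar}: shortfall is magnified {ratio:.1f}x relative to this bar")
print(f"shortfall height: {height_as_drawn:.0f}px drawn, "
      f"{height_at_blue_scale:.0f}px at the blue bar's scale")
```

Measured against the blue bar, the shortfall gets a bit over four times as much height per signature.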
I’m really not accusing 3News of deliberately distorting the numbers: it looks to me as if the shortfall bar has been made the right height to contain its text, the combined height of the blue and red bars is scaled to the available screen real estate, and the black bar is scaled to the total blue+red height. But it’s a pity that the result is to amplify the visual size of the shortfall, and if the visual size weren’t important the graph would be a complete waste of time.
Scaled in proportion, the bars look like this
Dan Kahan, a researcher in the Cultural Cognition project at Yale Law School, has an interesting post on “the science communication problem”
The motivation behind this research has been to understand the science communication problem. The “science communication problem” (as I use this phrase) refers to the failure of valid, compelling, widely available science to quiet public controversy over risk and other policy relevant facts to which it directly speaks. The climate change debate is a conspicuous example, but there are many others
Two opportunities for public comment that will expire soon, and where StatsChat readers might have something to say
This sort of public comment is qualitative, rather than quantitative. Neither the Select Committee nor Stats New Zealand is likely to count up the number of submissions taking a particular view and use this as a population estimate, because that would be silly. What they should be aiming for is a qualitatively exhaustive sample, one that includes all the arguments for or against the bill, or all the different ways people use Census data.
JustSpeak is
a non-partisan network of young people speaking to, and speaking up for a new generation of thinkers who want change in our criminal justice system.
I’m linking because they have a good visualisation of the recently-released police crime statistics, comparing the proportion of apprehensions leading to prosecution among Maori and Pakeha youth. The back-to-back bar charts take advantage of the brain’s ability to detect lack of symmetry.
I probably would have left out the homicide category, which has too few cases to compare, and it would be interesting to see if small gaps between the categories help.
The real problem is in interpretation. It’s hard to say what you’d expect just from economic differences and differences in where people live, without any differences in how they are treated by police. A higher proportion of prosecutions could mean the police are using their discretion to prosecute more Maori youth, but a lower proportion of prosecutions could just as easily have been interpreted as harassment of innocent Maori youth.
Keith Ng’s annual NZ Budget visualization seems to be up. Go play.
You might also like last year’s one. And possibly even the 2011 radioactive space donut.
If you’re one of the 40,000 or so people who have signed the Alltrials petition, you will have received an email from Ben Goldacre asking for more help.
The Declaration of Helsinki, the major document on research ethics in medicine, already states
30. Authors, editors and publishers all have ethical obligations with regard to the publication of the results of research. Authors have a duty to make publicly available the results of their research on human subjects and are accountable for the completeness and accuracy of their reports. They should adhere to accepted guidelines for ethical reporting. Negative and inconclusive as well as positive results should be published or otherwise made publicly available. Sources of funding, institutional affiliations and conflicts of interest should be declared in the publication. Reports of research not in accordance with the principles of this Declaration should not be accepted for publication.
The petition is trying to get these principles enforced. Publication bias isn’t just a waste of the voluntary participation of (mostly sick) people in research. Publication bias means we don’t know which treatments really work.
In my first job (as a lowly minion) in medical statistics, my boss was Dr John Simes, an oncologist. Back in the 1980s he had shown that publication bias in cancer trials gave the false impression that a more toxic chemotherapy regimen for ovarian cancer had substantial survival benefits to weigh against the side-effects. Looking at all registered (published and unpublished) trials showed the survival benefit was small and quite possibly non-existent. The specific treatment regimens he studied have long been outmoded, but his message is still vitally important.
These examples illustrate an approach to reviewing the clinical trial literature, which is free from publication bias, and demonstrate the value and importance of an international registry of all clinical trials.
Nearly thirty years later, we are still missing information about the benefits and risks of drugs.
For example, influenza researchers have used detailed simulation models to assess control strategies for pandemic flu. These simulation models need data about the effectiveness of drugs and vaccines. When the next flu pandemic hits, we really need these models to be accurate, so it’s especially disturbing that Tamiflu is one of the drugs with substantial unpublished clinical trial data.