July 17, 2012

Margin of error yet again

In my last post I more-or-less assumed that the design of the opinion polls was handed down on tablets of stone. Of course, if you really need more accuracy for month-to-month differences, you can get it. The Household Labour Force Survey gives us the official estimates of the unemployment rate. We need to be able to measure changes in unemployment that are much smaller than a few percentage points, so StatsNZ doesn’t just use independent random samples of 1000 people.

The HLFS sample contains about 15,000 private households and about 30,000 individuals each quarter. We sample households on a statistically representative basis from areas throughout New Zealand, and obtain information for each member of the household. The sample is stratified by geographic region, urban and rural areas, ethnic density, and socio-economic characteristics. 

Households stay in the survey for two years. Each quarter, one-eighth of the households in the sample are rotated out and replaced by a new set of households. Therefore, up to seven-eighths of the same people are surveyed in adjacent quarters. This overlap improves the reliability of quarterly change estimates.

That is, StatsNZ uses a much larger sample, which reduces the sampling error at any single time point, and samples the same households more than once, which reduces the sampling error when estimating changes over time. The example they give on that web page shows that the margin of error for annual change in the employment rate is on the order of 1 percentage point. StatsNZ calculates sampling errors for all the employment numbers they publish, but I can’t find where they publish the sampling errors.

[Update: as has just been pointed out to me, StatsNZ publish the sampling errors at the bottom of each column of the Excel version of their table,  for all the tables that aren’t seasonally adjusted]
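
To make the two effects concrete, here is a rough Python sketch. It assumes simple random sampling, which the HLFS is not (it is stratified and samples whole households), so the numbers will not match StatsNZ's published sampling errors. The function names, the 7% rate, and the correlation of 0.8 between one person's answers in adjacent quarters are all made up for illustration; only the sample sizes and the seven-eighths overlap come from the description above.

```python
import math

Z95 = 1.96  # normal critical value for an approximate 95% interval


def moe_proportion(p, n, z=Z95):
    """Approximate margin of error for a proportion p estimated from
    a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def moe_change(p, n, overlap=0.0, rho=0.0, z=Z95):
    """Approximate margin of error for the change between two estimates,
    each from a sample of size n.

    overlap: fraction of the sample common to both time points
    rho:     correlation between the same person's responses at the
             two time points (an assumed value, not an HLFS figure)
    """
    var_one = p * (1 - p) / n
    # Only the overlapping part of the sample contributes covariance.
    cov = overlap * rho * var_one
    return z * math.sqrt(2 * var_one - 2 * cov)


if __name__ == "__main__":
    p = 0.07  # illustrative rate, not an official figure

    print("single estimate, n = 1000: ", moe_proportion(p, 1000))
    print("single estimate, n = 30000:", moe_proportion(p, 30000))
    print("change, independent samples:", moe_change(p, 30000))
    print("change, 7/8 overlap, rho = 0.8:",
          moe_change(p, 30000, overlap=7 / 8, rho=0.8))
```

With independent samples the variance of a difference is just the sum of the two variances. With a rotating panel the positive correlation in the overlapping part is subtracted off, which is why the change is estimated more precisely than either sample size alone would suggest.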


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.