December 17, 2014

Good news, bad percentages

The New York Times has a story reporting on new Ebola research, which suggests there are fewer unreported cases and less transmission in the general community than was previously thought. This is good news, both because there aren’t as many cases and because control might be easier.

One unfortunate feature of the NYT story:

By looking at virus samples gathered in Sierra Leone and contact-tracing data from Liberia, the scientists working on the new study estimated that about 70 percent of cases in West Africa go unreported. That is far fewer than earlier estimates, which assumed that up to 250 percent did.

It’s hard to see how the scientific community could have assumed 250% of cases were unreported. Mark Lieberman at Language Log looks at the research paper to find, firstly, that the ‘70%’ and ‘250%’ are the unreported cases as a fraction of the reported cases. That is, 70% unreported means that for every 100 reported cases there are 70 unreported, which one would usually call 41% unreported (70 out of 170 total cases). He also notes that 70% is the upper bound of a range estimated in the paper, with the best estimate being 17% (that is, 17/117, or 14.5% unreported). What seems to have happened is that the word ‘underreported’ was changed to ‘unreported.’
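The conversion between the two ways of expressing the same number is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic above; the function name `unreported_share` is made up for the example and does not come from the paper or from Language Log.

```python
def unreported_share(ratio_percent):
    """Convert 'unreported cases per 100 reported cases' into
    'unreported cases as a percentage of all cases'.

    Hypothetical helper for illustration only.
    """
    # Total cases = 100 reported + ratio_percent unreported.
    return 100 * ratio_percent / (100 + ratio_percent)


# 17 is the paper's best estimate, 70 its upper bound,
# and 250 the older, more speculative figure.
for ratio in (17, 70, 250):
    share = unreported_share(ratio)
    print(f"{ratio}% of reported cases -> {share:.1f}% of all cases")
```

On this scale even the ‘250%’ figure comes out at about 71% of all cases unreported, which is large but at least a possible percentage.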

What Language Log doesn’t look at is the transmission of these percentages. There’s a story (and press release) at Yale News, home of most of the researchers, which has an intermediate mutation:

Researchers were also able to estimate that for every Ebola case reported, fewer than one went unreported. This estimate, that up to 70% of cases were not reported, is significantly lower than previous estimates. “For Sierra Leone, underreporting is lower than some more speculative estimates that ran as high as 250%,” Townsend noted.

with ‘underreporting’ in the direct quotation and ‘unreported’ in the main text. From there, it’s easy to see how the distinction could have been tidied away at the NYT and at TheHill.com.

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.