April 25, 2020

Why New York is different

There have been three fairly large American seroprevalence studies recently. These are studies that sample people from the community and test their blood for antibodies to the COVID-19 virus. Most people who have been infected, even if they recovered weeks ago, will have antibodies to the virus. And people who have not been infected will not have antibodies to the COVID-19 virus, though the test isn’t perfect and will occasionally be confused by antibodies to something else.

The two studies in Los Angeles and in Santa Clara County (Silicon Valley) estimated that a few percent of people had been exposed to the virus. The New York study estimated far higher rates — in New York City itself, 21%.  The Californian studies have been widely criticised because of questions about the representativeness of the sample and the accuracy of the statistical calculations.  The New York study has had much less pushback.

One difference between the New York study and the others is that it’s not being pushed as a revolutionary change in our understanding of coronavirus, so people aren’t putting as much effort into looking at the details. Much more important, though, is that it is far easier to get a prevalence of 20% roughly right than to get a prevalence of 2-3% roughly right. If you make extraordinary claims based on a prevalence estimate of 2-3%, you need data and analysis of extraordinary quality (in a good way).  If your prevalence estimate of 20% is consistent with the other empirical data and models for coronavirus, it doesn’t need to stand on its own to the same extent.

Getting a good estimate of a prevalence of 2-3% is hard because the number of people who really have been exposed is going to be about the same as the number where the test gets confused and gives the wrong answer. If you aren’t precisely certain of the accuracy of the test (and you’re not), the uncertainty in the true prevalence can easily be so large as to make the effort meaningless. On top of that, the quality of the sampling is critical: even a little over-volunteering by people who have been sick and want reassurance can push your estimate well above the truth. You can easily end up with an estimate saying the prevalence is much higher than most people expect, but only very weak evidence for that claim.
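To make that concrete, here’s a minimal sketch with made-up but plausible numbers (none of them from the actual studies) of how even a small false-positive rate swamps a low true prevalence:

```python
# Sketch: how false positives swamp a low true prevalence.
# All numbers here are illustrative assumptions, not study data.

def apparent_positive_rate(prevalence, sensitivity, false_positive_rate):
    """Expected fraction of tests that come back positive."""
    return prevalence * sensitivity + (1 - prevalence) * false_positive_rate

true_prevalence = 0.025  # suppose 2.5% of people really have been exposed
sensitivity = 0.90       # suppose the test detects 90% of true exposures

for fpr in [0.000, 0.005, 0.010, 0.020]:
    rate = apparent_positive_rate(true_prevalence, sensitivity, fpr)
    spurious = (1 - true_prevalence) * fpr / rate
    print(f"false-positive rate {fpr:.1%}: "
          f"apparent prevalence {rate:.2%}, "
          f"{spurious:.0%} of positives spurious")
```

With a 2% false-positive rate, nearly half the positives are spurious, so a small error in the assumed accuracy of the test becomes a large error in the estimated prevalence. At a 20% true prevalence the same errors barely matter.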

It looks as though the antibody test used in New York was less accurate than the one used in Santa Clara; the New York State lab that ran the testing says only that they are confident the rate of positive tests in truly unexposed people is less than 7%; their best-guess estimate will presumably be around 2-3%, in contrast with the best-guess estimate of 0.5% for the test used in Santa Clara. But even if, in the worst case, 7% of tests were false positives, that still leaves 14% that were genuine. And since the test will miss some people who were truly exposed, the true prevalence will be higher than 14%. Suppose, for example, that the test picks up antibodies in 90% of people who really have been exposed. The 14% we’re seeing is only 90% of the truth, so the truth would be about 16%, and with a less-sensitive test the truth would have to be higher. So, even though the test is imperfect, somewhere between one in five and one in seven people tested had been exposed to the virus. That’s a narrow enough range to be useful. You still have to worry about sampling: it’s not clear whether sampling people who are out shopping will give you an overestimate or an underestimate relative to the whole population, but the bias would have to be quite large to change the basic conclusions of the study.
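If you want to check that arithmetic, here’s a small sketch using the standard Rogan-Gladen correction. The 90% sensitivity is the illustrative assumption from the paragraph above, not a published property of the New York test:

```python
# Rogan-Gladen correction: invert
#   apparent = sensitivity * p + (1 - specificity) * (1 - p)
# to recover the true prevalence p from the apparent positive rate.
# Sensitivity and specificity values are illustrative assumptions.

def corrected_prevalence(apparent, sensitivity, specificity):
    return (apparent + specificity - 1) / (sensitivity + specificity - 1)

apparent = 0.21  # observed positive rate in New York City

# Worst case the lab is confident about: 7% false positives
print(corrected_prevalence(apparent, 0.90, 0.93))  # ~0.17, close to the 16% above
# Their likely best guess: around 3% false positives
print(corrected_prevalence(apparent, 0.90, 0.97))  # ~0.21
```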

The estimate for New York fits reasonably well with the estimate of roughly 0.1% for the proportion of the New York City population that has died because of the pandemic, and the orthodox estimate of around 1% for the proportion of infected people who will die of COVID. These all have substantial uncertainty: I’ve talked about the prevalence estimate already. The infection fatality rate estimate is based on a mixture of data sets, all unrepresentative in different ways. And the excess mortality figure itself is fairly concrete, but it includes, e.g., people who died because they didn’t try to get to hospital for heart attacks, and, in the other direction, road deaths that didn’t happen. It is still important that these three estimates fit together, and it correctly gives researchers more confidence in all the numbers. The Californian studies imply only about 0.1% of infections are fatal, and that doesn’t fit the excess mortality data or the standard fatality rate estimate at all.
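As a quick consistency check, using just the rough figures above (a sketch, not a serious fatality-rate estimate):

```python
# Do the three rough numbers fit together? Illustrative arithmetic only.

prevalence = 0.21        # estimated fraction of NYC ever infected
dead_fraction = 0.001    # roughly 0.1% of the NYC population dead

implied_ifr = dead_fraction / prevalence
print(f"implied infection fatality rate: {implied_ifr:.2%}")  # about 0.48%
```

About 0.5%: the same order of magnitude as the orthodox ~1% figure, with deaths still accruing at the time. Plugging a Californian-style prevalence of 2-3% into the same New York death figures would imply a fatality rate of around 4%, far above any mainstream estimate, which is another way of seeing that the low-prevalence numbers don’t fit.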

There’s an analogy that science is like a crossword[1]. But it’s not a cryptic crossword, where correct answers are obvious once you see the trick. It’s the other sort of crossword, where potential answers to different clues support or contradict each other. If the clue is “Controversial NZ party leader (7)” and using “Bridges” would force another clue to be a seven-letter word ending in “b”, you might pencil in “Seymour” instead and await further evidence.

[1] Independently invented by multiple people, including Susan Haack and Chad Orzel.


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • Joseph Delaney

    Yes, I think that this is precisely correct. If the CA studies had made more modest claims, in the paper and in the media, we would all have defaulted to “measuring a low prevalence is hard” and there would be a minor discussion about the test properties in the hopes of improving the inference.

    The NY study is a better sampling environment and continues to build up evidence for the relative virulence of COVID-19. I am sympathetic to arguments that our response might be improved, mostly because it is unlikely that plans thrown together in an emergency happen to be optimal (even if they saved a lot of lives, we can try to puzzle out alternate plans).

    But decisions about this extremely tough public health problem are best made with the strongest evidence possible, and it was always going to be hard for prevalence studies in places like CA to inform this trade-off much beyond confirming a relatively low rate of infection.


  • Steve Curtis

    There was an earlier seroprevalence study in Germany:
    “virologist Hendrik Streeck from the University of Bonn announced preliminary results from a town of about 12,500 in Heinsberg, a region in Germany that had been hit hard by COVID-19. He told reporters his team had found antibodies to the virus in 14% of the 500 people tested.”
    However, there are other questions about what they were testing and how the study was carried out:
    https://www.sciencemag.org/news/2020/04/antibody-surveys-suggesting-vast-undercount-coronavirus-infections-may-be-unreliable


  • Ramesh Nair

    It appears that both Californian studies used the ‘same test kit that was not FDA approved’ [mercurynews.com, ‘Feud over Stanford coronavirus study’]. This in itself is not a major problem, but it seems the Santa Clara study authors not only used a small sample size but also based their estimate of the test’s false-positive rate on the manufacturer’s claims about its sensitivity and specificity. It seems unlikely, especially in the current Wild West market of competing PCR and antibody tests, that any commercial operator would overstate the inaccuracy of its test kits. Naturally, when all these test kits are new and not validated, any serious population study would not only have used a larger sample size, but also deliberately withheld from use a large number of the test kits from its batches. These unused kits ought then to have been independently lab-tested to determine their true sensitivity and specificity, independent of the manufacturer’s claims.


    • Thomas Lumley

      The main constraint on test validation isn’t having enough tests to use for validation, it’s having enough known-negative *samples* to validate against. Most people don’t have an unlimited pool of pre-COVID blood samples lying around. I know that’s been a constraint for test validation in New Zealand.
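      As a rough illustration (hypothetical panel sizes, not any lab’s actual validation data), an exact binomial bound shows how fast the known-negative panel needs to grow:

      ```python
      # How tightly does a panel of n known-negative samples pin down the
      # false-positive rate? Exact (Clopper-Pearson) upper confidence bound;
      # the panel sizes below are hypothetical.
      from scipy.stats import beta

      def fpr_upper_bound(positives, n, level=0.95):
          if positives == n:
              return 1.0
          return beta.ppf(level, positives + 1, n - positives)

      for n in [30, 100, 400]:
          print(f"{n} known negatives, 0 positive: "
                f"FPR < {fpr_upper_bound(0, n):.1%}")
      # A panel of 30 samples only rules out a ~10% false-positive rate;
      # you need several hundred pre-pandemic samples before the bound is
      # small compared with a 2-3% prevalence.
      ```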
