April 17, 2013

Drawing the wrong conclusions

A few years ago, economists Carmen Reinhart and Kenneth Rogoff wrote a paper on national debt, in which they found little relationship between debt and economic growth as long as debt was below 90% of GDP, but lower growth above that level. The paper was widely cited as support for economic strategies of ‘austerity’.

Some economists at the University of Massachusetts attempted to repeat their analysis and didn’t get the same result. Reinhart and Rogoff sent them the data and spreadsheets they had used, and it turned out that the analysis they had done didn’t quite match the description in the paper. Part of the discrepancy was an error in an Excel formula that accidentally excluded a bunch of countries, but Reinhart and Rogoff had also deliberately excluded some countries and time periods that had high growth and high debt (including Australia and NZ immediately post-WWII), and had given each country the same weight in the analysis regardless of the number of years of data it contributed. (paper — currently slow to load, summary by Mike Konczal)
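To see why the equal-country weighting matters, here’s a minimal sketch in R (with made-up numbers loosely modelled on the NZ example, not the actual Reinhart and Rogoff data): a country contributing a single bad year counts exactly as much as a country contributing two decades of steady growth.

    ## Hypothetical growth rates (%) for two countries in the high-debt
    ## category: country A contributes 19 years, country B a single year.
    growth <- data.frame(
      country = c(rep("A", 19), "B"),
      rate    = c(rep(2.5, 19), -7.6)
    )

    ## Weighting every country-year equally:
    mean(growth$rate)                                # about  2.0

    ## Weighting every country equally (one mean per country,
    ## then the average of those means):
    mean(tapply(growth$rate, growth$country, mean))  # about -2.6

With equal country weights, one bad year from one country is enough to drag the whole category’s average below zero.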

Some points:

  • The ease of making this sort of error in Excel is exactly why a lot of statisticians don’t like Excel (despite its other virtues), so that aspect of the story has received a lot of publicity.
  • Reinhart and Rogoff point out that they only claimed to find an association, not a causal relationship, but they certainly knew how the paper was being used, and if they didn’t think it provided evidence of a causal relationship they should have said something a lot earlier. (I think Dan Davies on Twitter put it best)
  • Chris Clary, a PhD student at MIT, points out that the first author on the paper demonstrating the failure to replicate (Thomas Herndon) is also a grad student, and notes that replicating things is a job often left to grad students.
  • The Reinhart and Rogoff paper wasn’t the primary motivation for, say, the UK Conservative Party to want to cut taxes and government spending. The Conservatives have always wanted to cut taxes and spending; it’s a significant part of their basic platform. The paper, at most, provided a bit of extra intellectual cover.
  • The fact that the researchers handed over their spreadsheet pretty much proves they weren’t deliberately deceptive — but it’s a lot easier to convince yourself to spend a lot of time checking all the details of a calculation when you don’t like the answer than when you do.

Roger Peng, at Johns Hopkins, has also written about this incident. It would, in various ways, have been tactless for him to point out some relevant history, so I will.

The Johns Hopkins air pollution research group conducted the largest and most comprehensive study of the health effects of particulate air pollution, looking at deaths and hospital admissions in the 90 largest US cities. This was a significant part of the evidence used in setting new, stricter air pollution standards — an important and politically sensitive topic, though a few orders of magnitude less so than austerity economics. One of Roger’s early jobs at Johns Hopkins was to set up a system that made it easy for anyone to download their data and reproduce or vary their analyses. The size of the data and the complexity of some of the analyses meant that just emailing a spreadsheet to people was not even close to adequate.

Their research group became obsessive (in a good way) about reproducibility long before most other researchers in epidemiology. One likely reason is a traumatic experience in 2002, when they realised that the default settings of the software they were using had led to incorrect results in a lot of their published air pollution time series analyses (the sketch below gives a feel for how that can happen). They reported the problem to the EPA and their sponsors, fixed it, and reran all the analyses in a couple of weeks; fortunately, the qualitative conclusions did not change. You could make all sorts of comparisons with the economists’ error, but that is left as an exercise for the reader.
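What that kind of default-settings problem can look like: iterative fitting routines stop when successive estimates change by less than some tolerance, and a tolerance that is loose relative to the size of the effect being estimated can quietly distort the answer while still reporting convergence. Here’s a minimal sketch in R with simulated data (the Hopkins analyses used different software and far more elaborate models; everything below is invented for illustration).

    set.seed(1)
    ## Simulated daily counts with a deliberately small covariate effect,
    ## loosely in the spirit of an air pollution time series.
    x <- rnorm(2000)
    y <- rpois(2000, exp(0.5 + 0.05 * x))

    ## A loose convergence tolerance versus a deliberately strict one.
    loose  <- glm(y ~ x, family = poisson,
                  control = glm.control(epsilon = 1e-2, maxit = 25))
    strict <- glm(y ~ x, family = poisson,
                  control = glm.control(epsilon = 1e-12, maxit = 100))

    ## Any difference between these two estimates is numerical slack from
    ## stopping early.  It looks negligible -- until you remember that the
    ## air pollution effects at stake were themselves fractions of a percent.
    coef(loose)["x"]
    coef(strict)["x"]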

 


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • Joseph Delaney

    “it’s a lot easier to convince yourself to spend a lot of time checking all the details of a calculation when you don’t like the answer than when you do”

    This is so true. And not just for the analyst but for the co-authors on the paper as well. Co-authors get much more interested in sensitivity analyses and operational definitions when an unexpected association appears, and often just get excited when the hoped-for association appears.


  • We just published a (minor) paper looking at historical changes in neurological disease naming in the scientific literature (at https://peerj.com/articles/67/). As a theoretical exercise in reproducible science, I also submitted a copy of the raw data and the R code used to generate, analyse, and display it. Because the data came from querying the PubMed database online, running the code meant that the data itself could be independently generated.

    I thought the process was all theoretical really, but within a day someone took the code, completely rewrote it in (much more concise) PHP, collected the data independently, and graphed it. Fortunately, the end results matched (Alf Eaton: https://docs.google.com/spreadsheet/pub?key=0AjePz4Y_bB3hdFJGZ0M2VGx3N2tlX3pIemhUSTZlbkE&gid=5).

    Ideally, posting data & code should really become the expectation rather than the exception. It’s not just for error correction/detection, but to allow for creative further analyses by others.


  • Duncan Hedderley

    In the grand tradition of “Titanic sinks: Local man safe”, Stuff claims it is all down to New Zealand (http://www.stuff.co.nz/business/industries/8574365/High-debt-low-growth-disproved)
