March 18, 2014

Your gut instinct needs a balanced diet

I linked earlier to Jeff Leek’s post on fivethirtyeight.com, because I thought it talked sensibly about assessing health news stories, and how to find and read the actual research sources.

While on the bus, I had a Twitter conversation with Hilda Bastian, who had read the piece (not through StatsChat) and was Not Happy. On rereading, I think her points were good ones, so I’m going to try to explain what I like and don’t like about the piece. In the end, I think she and I had opposite initial reactions to the piece from the same starting point: the importance of separating what you believe in advance from what the data tell you.

Since this is the Internet, I probably need to start by pointing out I’m perfectly aware Jeff Leek knows how to read a scientific paper and understands Bayes’ Theorem, and I’m not suggesting otherwise. Also, before we go on, for reasons that will become clear later, it’s very important that you spend just a moment imagining a purple cow. Ok? Done? Good.

There’s one formula in the piece

Final opinion on headline = (initial gut feeling) * (study support for headline)

which is based on Bayes’ Theorem.

Considered as a statement about probabilities, this says you need more evidence to be convinced of more surprising things, and it’s pretty solid. As someone with whom I share a name is supposed to have said about a very surprising claim: “Unless I see the mark of the nails in his hands, and put my finger in the mark of the nails and my hand in his side, I will not believe.”
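As a minimal sketch of that formula in its odds form (the numbers here are invented for illustration, not taken from the piece), Bayesian updating is just multiplication:

```python
def posterior_odds(prior_odds, bayes_factor):
    """Odds form of Bayes' Theorem:
    posterior odds = (prior odds) * (Bayes factor, i.e. study support)."""
    return prior_odds * bayes_factor

# A surprising headline: prior odds of 1 to 99 that it's true.
prior = 1 / 99

# A moderately supportive study: the data are 10x more likely
# if the headline is true than if it is false.
post = posterior_odds(prior, 10)

# Convert odds back to a probability.
prob = post / (1 + post)
print(round(prob, 3))  # about 0.092: still unlikely, despite positive evidence
```

The point the toy numbers make is the one in the text: a surprising claim starts with long odds against it, so even a reasonably supportive study leaves it improbable.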

Considered as a reasoning technique, it says that you need to separate what you already believed from what you just learned, so that you don’t mistake previous conviction for evidence. That is remarkably difficult to do, so difficult that even simplified checklist-Bayesian approaches like the one Jeff advocates are a step forward in practical reasoning [for one introduction to checklist-Bayesian thinking, you could try the famous/notorious ‘Harry Potter and the Methods of Rationality‘. People who like that sort of thing will find it’s exactly the sort of thing they like].

Writing down your prior opinion in advance and using a consistent checklist every time is a good trick for not lying to yourself, and not giving weight to just the aspects of the evidence that make you happy. You do actually have to do it every time, step by step; you can’t just have warm feelings about the concept of Bayes’ Theorem and get partial credit that way. For it to work, you can’t allow your assessment of evidence to be influenced by your gut instinct as to whether the claim is true, and that means you can’t skip steps of the checklist because they are going the wrong way, and you can’t wait until after you’ve done the checklist to quantify your prior opinion.

The problems with the formulation of Bayes’ Theorem as given are, first, that you have to assess your prior belief after seeing the headline, when it may be too late; second, that it takes no account of the quantitative strength of evidence in the data; and third and most importantly, the phrase ‘gut instinct’.

The phrase was presumably chosen to emphasise that beliefs prior to seeing the data will differ between people, but it has other unfortunate ideas associated with it. Gut instinct is what you credit when you ignore the evidence and make a decision. “I know the doctors say we should vaccinate little Timmy but my gut says all those injections can’t be good for him.” Gut instinct is subject to confusing what you want to be true with what you should believe. My gut would like it to be true that raising the minimum wage was a good way to tackle poverty. It makes sense as a story, and it’s more politically feasible than many other approaches. Sadly, I suspect that in NZ it isn’t true, even if it is in the US.

Subjectivist rational prior belief isn’t ‘gut instinct’ in the pejorative sense, and it’s much less personal than the phrase implies. There’s a set of theoretical results that says that two people who both use Bayes’ Theorem to update their beliefs in the light of evidence will have to end up with much the same beliefs after enough discussion.

In particular, and this, I think is the most important omission in the piece, your prior belief really should depend on the prior evidence. Your gut instinct needs a complete and balanced diet. That’s one of the most important points I’ve tried to make about medical stories on StatsChat: this story is not everything we know. That’s why it’s so great that everyone in New Zealand can access the Cochrane Library, to look at the summary results for all available randomised controlled trials on hundreds of topics.

Jeff’s presentation of Bayes’ Theorem also treats the question as true/false and assesses the evidence qualitatively. That’s clearly an oversimplification, but a fixable one. Whether it’s an important oversimplification depends on the situation.

For an example, consider the Women’s Health Initiative randomised trials. These provided moderately strong evidence that hormone replacement therapy increased the overall risk of serious disease in post-menopausal women by a small amount. A reasonable person could fail to be convinced of the increase in risk. On the other hand, the trial was so large that it provided compelling evidence that any possible benefit must be tiny, when many people had expected a large benefit (10-15% risk reduction). In that sort of situation the numerical values matter, not just true or false.
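To make the arithmetic concrete, here is a minimal sketch with invented numbers (a hazard ratio of 1.1 with standard error 0.05 on the log scale; these figures are illustrative, not the actual WHI results):

```python
import math

# Hypothetical trial result: estimated hazard ratio 1.1,
# standard error 0.05 on the log scale.
log_hr, se = math.log(1.1), 0.05

# 95% confidence interval on the hazard-ratio scale.
lo = math.exp(log_hr - 1.96 * se)
hi = math.exp(log_hr + 1.96 * se)

# An interval of roughly (1.00, 1.21) is compatible with no effect,
# so a reasonable person could fail to be convinced harm exists --
# but it firmly rules out the hoped-for 10-15% risk reduction
# (hazard ratios of 0.85-0.90).
print(round(lo, 2), round(hi, 2))
```

That’s the sense in which a large trial can be inconclusive about harm while being decisive about the absence of a large benefit.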

At the other extreme, if someone claims pomegranate juice produces big decreases in blood pressure and cortisol concentrations based on a small, uncontrolled, unblinded study, the real question isn’t ‘exactly how big?’, it’s ‘should we be paying any attention to this at all?’.

The checklist is really designed for the latter case, which is a pity; while those headlines are much more common, they are not the ones with real implications for personal health decisions.

[oh, yes, the purple cow. That’s an example from the philosopher Daniel Dennett. Was the cow facing to your left or your right? If you can’t answer immediately, you probably didn’t really imagine a purple cow, you just did some sort of mental pause and went on. It’s not the same thing, just as picking a number out of the air and writing it down isn’t the same thing as assessing your prior beliefs.]

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

• I couldn’t answer immediately because the purple cow was staring right at me head on, and I was trying to assess whether she was pitching slightly left or right.

• I feel that all is not well, instinctively.

• Thomas-

Saw your conversation with Hilda on Twitter. Thanks for leading with your faith in the fact that I both know Bayes’ theorem and can read a scientific paper. I think that is more credit than I generally get on the internet.

The piece had several major constraints which obviously make it difficult to teach an entirely nuanced view of paper reading. It had to be < 1,200 words, light on math, and readable by an only moderately technical audience.

Given those constraints my idea was to try to explain something simple that most people could apply regularly.

I think this deserves a longer post, which I'll put up at Simply Stats, but there is a tradeoff when speaking to a non-technical audience about a technical idea like evaluating papers.

On the one hand you can anger your core constituency (in this case, statisticians) by being willing to simplify the issue for a broad audience. By definition this requires glossing over some issues (when your prior is formed, whether the inference is subjective or objective Bayesian, quantitative effect sizes versus decision-based inference). On the other hand you can alienate your broad audience but please your core constituency.

I thought it was a fun way to get people thinking about headlines in a statistical way. I'm not saying this is the one true path to scientific knowledge.

Jeff

• Jeff, this isn’t a small point that can be brushed away with, well, you can’t make all the statisticians happy. Firstly, I’m not a statistician. Secondly, prior knowledge isn’t such a complex term to communicate well – or even in a fun way. There are many other ways to make the point in a word or so.

But “gut instinct” is the lingua franca of “alternative ways of knowing” and “truthiness” (http://en.wikipedia.org/wiki/Truthiness), not rationality at all. To choose something that evokes, culturally, the opposite of the brain – whether it be heart or gut – is to encourage personal bias, which undermines the intent.

We are too easily accepting of that which confirms our preferences while massively critiquing & finding a reason to dismiss inconvenient results. A method that implies the opposite of that is the way to go is a pretty big problem.

Being aware of, and then overcoming, personal bias is a bigger challenge than the bias from study design or statistics.

• Hilda,

I think you’ll find if you read my piece carefully that this:

“We are too easily accepting of that which confirms our preferences while massively critiquing & finding a reason to dismiss inconvenient results. A method that implies the opposite of that is the way to go is a pretty big problem.”

Is exactly the opposite of what my piece advocates.

Jeff

• Jeff, I agree it is the opposite of the message you intend to send. I read it again very carefully, and what you intend and what is there are far apart.

“Gut feeling” is a powerful message that strongly diverts the direction of what’s there to “instinct” and “other ways of knowing,” not empirical “knowledge.” And it speaks only of “a study,” and taking only that into consideration along with feelings (aka prejudices).

Closer reading of what actually is “on the page,” rather than what is assumed to be there, seems to me to make my point stronger. So we’ll have to agree to disagree on this one. The “gut feeling” we have as uninformed lay people encountering a study on something outside our expertise is not, by any stretch of the imagination I can make, the sum of knowledge from prior empirical evidence.

• Thomas Lumley

In case we get any amateur or professional philosophers of statistics coming this way, a note on my methodological views. This is moderately technical; if you don’t understand it, you don’t care.

When it comes to statistical inference I’m a raving compatibilist — I think that it is unusual in practice and worrying in theory for a practically helpful approach to inference not to be approximately valid from both a Savage Bayesian and error-statistical viewpoint. Because I’m a biostatistician, and biostatistics leans frequentist, that is more likely to mean trying to construct subjectivist Bayesian decision-theoretic versions of tools I think are useful.

When it comes to philosophy of science I’m under-informed, but firmly in the error-statistical camp; you can find me roughly at the intersection (union?) of Susan Haack and Deborah Mayo. I certainly don’t think there’s any point in a theory that implies Newton had a prior distribution on cosmological B-mode polarisation.

For philosophy of mathematics I’m even more ignorant and very out of date, but Proofs and Refutations made more sense to me than anything else I’ve read.