June 19, 2011

The misuse of DNA statistics

From the NZ Herald:

CIA personnel there compared it “with a comprehensive DNA profile derived from DNA collected from multiple members of bin Laden’s family,” the statement said. “The possibility of a mistaken identification is approximately one in 11.8 quadrillion.”

This is a common misreporting of DNA statistics, and it highlights the confusion surrounding evidence interpretation. The figure quoted in the CIA statement, 1 in 11.8 quadrillion, is known as a random match probability. It answers a specific question. In this case the question is, “What is the probability that someone else has this profile, given what we know about the DNA profile of the alleged victim (bin Laden) and the profiles of his extended family?” Note that this is a very different question from “What is the probability that this DNA comes from someone other than Mr bin Laden?”

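Confusing these two questions is a very common mistake, so common in fact that it has a name: the Prosecutor’s fallacy. The fallacy arises from a misunderstanding of conditional probability: the probability of the evidence given a hypothesis is treated as if it were the probability of the hypothesis given the evidence.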

In this case it is far more likely that the DNA analyst calculated a likelihood ratio. The likelihood ratio compares the probability of the evidence under two competing hypotheses. Sensible hypotheses here might be Hp: the body is Mr bin Laden, and Hd: the body is someone unrelated to Mr bin Laden. The correct statement would be “The (DNA) evidence is 11.8 quadrillion times more likely if the body is Mr bin Laden than if the body belongs to someone else who is unrelated to Mr bin Laden.” This is a statement about the evidence, not about the hypotheses.
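Written out, the distinction looks like the sketch below. The symbol E for the DNA evidence is my own shorthand, and the numerical value assumes that the reported figure really is a likelihood ratio for Hp against Hd, which is a guess rather than something the statement confirms.

```latex
\mathrm{LR} = \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)} \approx 1.18 \times 10^{16},
\qquad \text{whereas in general} \qquad
\Pr(E \mid H_d) \neq \Pr(H_d \mid E).
```

The quoted “possibility of a mistaken identification” reads as a claim about Pr(Hd | E), which the likelihood ratio alone does not supply.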

It is possible to give a statement regarding the hypotheses, but in order to do this we have to have some prior probabilities associated with them before we consider the evidence. The statistical formula that allows us to reverse the probability statements is known as Bayes’ Theorem.
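To make the role of the prior concrete, here is a minimal Python sketch of the odds form of Bayes’ Theorem: posterior odds = likelihood ratio × prior odds. The function name `posterior_probability` and the three prior values are purely illustrative assumptions of mine; nothing in the CIA statement tells us what prior would be appropriate, and the likelihood ratio is simply taken to be the reported 11.8 quadrillion.

```python
from fractions import Fraction


def posterior_probability(likelihood_ratio, prior_probability):
    """Return P(Hp | E) given the likelihood ratio and the prior P(Hp).

    Uses the odds form of Bayes' Theorem:
        posterior odds = likelihood ratio * prior odds
    Exact fractions are used so that tiny probabilities are not
    lost to floating-point rounding.
    """
    prior = Fraction(prior_probability)
    if not 0 < prior < 1:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    prior_odds = prior / (1 - prior)
    posterior_odds = Fraction(likelihood_ratio) * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Assumed likelihood ratio: 11.8 quadrillion = 11.8 x 10^15.
LR = 118 * 10**14

# Hypothetical priors for Hp, chosen only to show how much the answer moves.
for prior in (Fraction(1, 2), Fraction(1, 10**6), Fraction(1, 7 * 10**9)):
    posterior_hp = posterior_probability(LR, prior)
    posterior_hd = 1 - posterior_hp  # probability of Hd given the evidence
    print(f"prior P(Hp) = {float(prior):.3g}  ->  P(Hd | E) = {float(posterior_hd):.3g}")
```

The probability of Hd given the evidence changes by orders of magnitude as the prior changes, which is why a figure derived from the evidence alone cannot be quoted as the probability of a mistaken identification.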

Do I think the body belongs to someone other than Mr bin Laden? No, but I do think there is an obligation to use statistics correctly.


James Curran's interests are in statistical problems in forensic science. He consults with forensic agencies in New Zealand, Australia, the United Kingdom, and the United States. He produces and maintains expert systems software for the interpretation of evidence. He has experience as an expert witness in DNA and glass evidence, appearing in courts in the United States and Australia. He has very strong interests in statistical computing, and in automation projects.