L.A. Times Corrects the Most Trivial of Three Errors From Its Article on DNA, Statistics, and Cold Hits
Recently, I pointed out three errors in an L.A. Times article on DNA, statistics, and cold hits (see here and here). Two were substantive and one was a trivial instance of the newspaper turning a fraction upside down.
Guess which one they are correcting?
DNA evidence: A May 4 article in Section A about the statistical calculations involved in describing DNA evidence in a murder case contained an arithmetic error. It said that multiplying the probability of 1 in 1.1 million by 338,000 was the same as dividing 1.1 million by 338,000. Actually, it’s the same as dividing 338,000 by 1.1 million. The answer, a 1 in 3 probability of a coincidental match between crime scene DNA and genetic profiles in a state database, was correct.
Yes, that is the trivial error.
Congratulations to Xrlq’s Aunt Ruth for noting it and bringing it to my attention.
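For anyone who wants to check the arithmetic in that correction, here is a minimal sketch in Python. The variable names are my own, chosen for illustration; the two numbers are the ones quoted in the correction.

```python
from fractions import Fraction

# Figures quoted in the correction.
match_probability = Fraction(1, 1_100_000)  # "1 in 1.1 million"
database_size = 338_000                     # profiles searched

# Multiplying 1/1,100,000 by 338,000 ...
product = match_probability * database_size

# ... is the same as dividing 338,000 by 1.1 million (what the correction says),
right_way_up = Fraction(database_size, 1_100_000)

# ... and not the same as dividing 1.1 million by 338,000 (what the article said).
upside_down = Fraction(1_100_000, database_size)

print(float(product))       # a bit under one third
print(float(upside_down))   # a bit over three
```

The two fractions are reciprocals of each other, which is why the paper's "1 in 3" answer came out right even though the description of the division was inverted.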
But I am very, very disappointed that the paper is leaving two far more substantive and significant errors uncorrected. To recap, here was the first error:
Jurors were not told, however, the statistic that leading scientists consider the most significant: the probability that the database search had hit upon an innocent person.
In Puckett’s case, it was 1 in 3.
The reporter tells me:
In our story, we did not write that there was a 1 in 3 chance that Puckett was innocent, which would be a clear example of the prosecutor’s fallacy. Rather, we wrote: “Jurors were not told, however, the statistic that leading scientists consider the most significant: the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.” The difference is subtle, but real.
I fail to see any difference whatsoever.
The key fact: the hit to Puckett was the only hit that occurred. So when the article says there was a 1 in 3 chance that the search “had hit” on an innocent person, it is describing the chance that the hit to Puckett was a hit to an innocent person.
That is the same as saying there was a 1 in 3 chance that Puckett was innocent, which the reporter admits is inaccurate.
I believe the article meant to say this: if the database had consisted only of innocent people, there was still roughly a 1 in 3 chance that the search would produce a hit, and any such hit would necessarily be on an innocent person. Phrased that way, the statement would be accurate, and would shed light on the question of how surprised we should be by a database hit.
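A rough sketch of that hypothetical, in Python, may make the point concrete. It assumes every profile in the database belongs to an innocent person, that each profile has the same 1 in 1.1 million chance of matching by coincidence, and that the profiles are independent of one another. Those are simplifying assumptions of mine, not claims from the article, and the function name is made up for illustration.

```python
import math

def prob_at_least_one_hit(match_probability, database_size):
    """Chance that a search of an all-innocent database yields one or more hits.

    Assumes independent profiles with identical match probability.
    Computes 1 - (1 - p)**n using log1p/expm1 to stay accurate for tiny p.
    """
    return -math.expm1(database_size * math.log1p(-match_probability))

p = 1 / 1_100_000   # random match probability quoted in the article
n = 338_000         # database size quoted in the article

expected_hits = n * p                        # the "1 in 3" figure
chance_of_any_hit = prob_at_least_one_hit(p, n)

print(f"expected coincidental hits: {expected_hits:.3f}")
print(f"chance of at least one hit: {chance_of_any_hit:.3f}")
```

Under these assumptions the expected number of coincidental hits is about 0.31, and the chance of getting at least one hit comes out somewhat lower, around 26 percent. Either way, this is a statement about what a search of innocent people would produce, not about whether the one person actually hit is innocent.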
But that’s not what the paper said. Instead, the article stated the odds that the search “had hit” on an innocent person. In other words, it stated the odds that Puckett himself was innocent.
By the way, the reporter indicated in an e-mail to me that he believes commenter Xrlq agrees with him on this point. He should read this post, in which Xrlq says that the 1 in 3 number as expressed by the paper is “almost certainly wrong.”
The second error was this passage:
Because the match in Puckett’s case involved only 5 1/2 genetic locations, the chance it was coincidental was higher but still remote: 1 in 1.1 million.
This is a classic example of the “prosecutor’s fallacy.” [UPDATE: I have a professor telling me it's more accurately called the "transposition fallacy".] The paper took a number meant to express the generalized odds of an event occurring, and used it to express the chance that a particular occurrence was a coincidence.
If there’s a 1 in 100 million chance of winning the lottery, you can’t say of the winner: “the chance that his win was coincidental was 1 in 100 million.”
That makes it sound like he was certain to win. But until he did win, he was almost certain not to.
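The lottery point can be sketched with Bayes' rule. The 1 in 100 million figure is the one used above; the rest is hypothetical and invented purely for illustration: I assume that one player in a trillion rigs the draw and that a rigged ticket always wins. The function name is also my own.

```python
def chance_win_was_coincidental(p_win_if_honest, p_win_if_rigged, prior_rigged):
    """P(the win was an honest fluke | this player won), by Bayes' rule."""
    prior_honest = 1.0 - prior_rigged
    honest_and_wins = p_win_if_honest * prior_honest
    rigged_and_wins = p_win_if_rigged * prior_rigged
    return honest_and_wins / (honest_and_wins + rigged_and_wins)

# Generalized odds of the event: P(win | honest player).
p_win_if_honest = 1 / 100_000_000

# Hypothetical inputs, not real data.
p_win_if_rigged = 1.0
prior_rigged = 1e-12

posterior = chance_win_was_coincidental(
    p_win_if_honest, p_win_if_rigged, prior_rigged
)
print(f"P(win | honest)        = {p_win_if_honest:.0e}")
print(f"P(coincidental | win)  = {posterior:.4f}")
```

With these made-up inputs, the chance that the winner's win was a coincidence is well above 99.9 percent, nowhere near 1 in 100 million. The two probabilities answer different questions, and swapping one for the other is the fallacy.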
Again, I have a good idea what the paper meant to say. But it’s not what the paper actually said.
The issue here is not the math. It’s about the proper way to express the math in English. It’s a tricky thing to do, but the fact that it’s tricky doesn’t excuse a failure to correct misleading language.