Patterico's Pontifications

5/30/2008

Read This Post Even If You Have Been Confused By (Or Put Off By) the Posts on DNA, Statistics, and Cold Cases

Filed under: Dog Trainer,General — Patterico @ 7:15 pm



I’m taking a survey, and I want maximum participation. You don’t have to understand the ins and outs of statistics. This is a simple and straightforward question, having to do with the meaning of the English language.

Assume police have a DNA sample. They run the profile through a DNA database. They get one match: to a man named Puckett.

The L.A. Times then writes:

The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

My assertion: the paper is saying there was a 1 in 3 chance that Puckett was innocent.

The L.A. Times says I’m wrong. They say there is another way to read that language — but they won’t tell me what it is. (Yes, I asked.)

I say I’m right. This is the only way to read that passage.

What say you? Am I right or wrong?

Please begin your comment with the single declarative statement: “You’re right” or “You’re wrong.” Then explain away to your heart’s content.

If you say I’m wrong, please give me an alternate explanation of that quoted passage, that is consistent with the facts provided: a database search resulting in only one hit, to a man named Puckett.

Please, no discussions about whether the 1 in 3 number is actually right or wrong. This is a simple question about what the paper said, and whether it’s ambiguous or clear. I want to keep the question that simple.

I claim the quoted language can be read only one way: as an assertion that there is a 1 in 3 chance Puckett is innocent. Am I right or wrong?

169 Responses to “Read This Post Even If You Have Been Confused By (Or Put Off By) the Posts on DNA, Statistics, and Cold Cases”

  1. You are right.

    The sentence is unclear, but “it” seems to refer back to its antecedent of hitting upon an innocent person; the intent is to imply that there was a 1 in 3 chance he was innocent.

    BTW I edit scholarly writing, and I would tell this writer to change the sentence to eliminate the vagueness of “it.”

    Patricia (f56a97)

  2. You’re right. Since they only hit on Puckett, it stands to reason that if the odds are 1 in 3 that they hit on an innocent person, then the odds are exactly the same that Puckett is that innocent person. That said, the real error here lies in them having said there is a 1 in 3 chance that anything happened in the past. Even if the two statements were different, the Times’s response would boil down to “we didn’t draw the silly, unfounded fallacious conclusion you said we drew. We drew a different silly, unfounded, fallacious conclusion, dammit.”

    Xrlq (62cad4)

  3. You’re right. That’s the only interpretation I would make of that statement.

    Stashiu3 (460dc1)

  4. you are right.

    With a minor edit that does not affect the concept, it proves you 100% correct.

    …the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    ira (b29219)

  5. you are right.

    With a minor edit that does not affect the concept, it proves you 100% correct.

    …the probability that the database search had hit upon an innocent person was 1 in 3.

    removed – “. In Puckett’s case, it”

    (Sorry for the dupe. Strike didn’t work)

    ira (b29219)

  6. You are right.

    In fact, I don’t see any other way to understand the quote that does not involve drastic redefinition of essential terms like statistic, innocent, etc.

    kishnevi (3b614c)

  7. You are right. The first sentence says chance of a hit on an innocent. The second says that chance was 1 in 3… Therefore, using their numbers, there’s a 33% chance he’s innocent, a 66% chance he’s guilty…

    kimsch (2ce939)

  8. You are right.

    Are there any decent online tools for sentence diagramming?

    Al (b624ac)

  9. I think you’re right, Patrick… It sounds like a 1 in 3 shot he’s been wrongly “hit” in the database…

    Scott Jacobs (d3a6ec)

  10. You’re right.

    Seems fairly clear to me what the passage is meant to convey.

    Gus (b5d255)

  11. You are right. There is no other way to read that simple declarative sentence. They are saying that in Puckett’s case the probability of a false positive was 1 in 3

    Not a Yank (b029e6)

  12. You are 1/3 right, according to the times. I’m assuming they think that there are possibly two other ways to interpret their statement. Since those other two ways are “secret”, one must assume a 1/3 chance of error for each. This gives us 1/3 x 1/3, or 1/9 chance that their error isn’t incorrect. Using the law of double negatives (which I just made up), there cannot be a non-answer to an unasked “secret” non-question.

    I’d tell you my reasoning behind all this, but it’s secret.

    Apogee (366e8b)

  13. You’re right.

    no one you know (1ebbb1)

  14. You’re right.

    They are grammatical idiots at the LAT, and won’t provide you with the alternate meaning because there isn’t one. They are covering their rear ends.

    DoctorHook (ad11d8)

  15. “You’re right”

    I would have to study under the Clintons to find another interpretation.

    Arthur (f6a9bf)

  16. We need to wait for SEK to visit to tell us how to correctly interpret the information. That’s my story and I’m sticking to it.

    daleyrocks (c4f06f)

  17. You’re right.

    If the Times ever tells you the other way to read that sentence, please share it with all of us.

    Steverino (a7584b)

  18. I’ve read that thing til I can’t think straight.
    If there’s a one in three chance you could hit on an innocent person, wouldn’t that mean the other two are guilty?

    jimboster (364ef3)

  19. You’re right. You properly emphasized the past tense to show that the LAT was talking about working backwards from the position of knowing that exactly 1 hit was returned from the DB.

    Daryl Herbert (4ecd4c)

  20. You’re right-
    or they are horrible writers. Possibly both. They *could* be right if the stat was 1 in whatever of any hit, and they made the assumption that the bad guy would be there, and THEN got the 1 in three of an innocent.

    Foxfier (15ac79)

  21. jimboster,

    I hate to break the beautiful streak of comments starting with that sublime phrase “You’re right” — but you have a good point. The way the LAT puts it, they actually claim there is a 2/3 chance Puckett is guilty, based on the statistics alone. That is nonsense, as you can’t possibly make any such assertion on the statistics alone.

    (This means, ironically, that I am showing their position to be inordinately pro-prosecution! But I have always maintained that my first devotion is to accuracy, and not to a particular agenda.)

    And with that, I return to a strict enforcement of the rule that we not discuss whether the 1 in 3 statistic is correct. The LAT reporter admitted one formulation would be incorrect. He claimed he said something else. A parade of commenters is disputing his assertion. Let the parade continue!

    Patterico (cb443b)

  22. You’re right. I can’t see another way to understand the sentence, even with the “newspaper + math –> error” axiom.

    htom (412a17)

  23. You’re right!

    The L.A. Times says I’m wrong. They say there is another way to read that language — but they won’t tell me what it is. (Yes, I asked.)

    That brings back memories of my childhood and my older brother’s response to a question.
    “It’s for me to know and you to find out!”

    atxcowgirl (d3cf50)

  24. You’re wrong. (based only on the sentence quoted.) [Patterico notes: Joe is not answering the question I asked. More in a comment below. — P]

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.”

    Let me illustrate:

    There are 2 coins both showing tails on the table. The odds that they all showed tails by chance are 1 in 4. (1/2^2)

    There are 5 coins all showing tails sitting on the table, the odds that they all showed tails by chance is 1 in 32 (1/2^5)

    There are 500 coins all showing tails sitting on the table, the odds that they all showed tails by chance is 1 in 2^500 (1/2^500)

    I don’t see how the sentence quoted says much about his guilt or innocence.

    Also, DNA seems like a more powerful tool for eliminating suspects than including them.

    Joe (c0e4f8)
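
    The coin arithmetic in comment 24 can be checked with a minimal Python sketch. It assumes fair, independent coins, and the function name `all_tails_probability` is made up for illustration. It reproduces 1 in 4 for two coins and 1 in 32 for five, and for 500 coins it gives 1 in 2^500.

```python
from fractions import Fraction


def all_tails_probability(n_coins: int) -> Fraction:
    """Exact probability that n fair, independent coins all land tails."""
    return Fraction(1, 2) ** n_coins


if __name__ == "__main__":
    for n in (2, 5, 500):
        p = all_tails_probability(n)
        # The denominator is 2**n; for large n, report its digit count
        # instead of printing the whole number.
        digits = len(str(p.denominator))
        print(f"{n} coins: 1 in 2^{n} ({digits}-digit denominator)")
```

    Exact fractions are used rather than floats so that the 500-coin case is not rounded away.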

  25. Shoot, should read the odds that they had all…

    Joe (c0e4f8)

  26. God only knows what they’re saying. The only thing the hit proves is that there was a person with that genetic profile in the database. Certainly guilty of having been put in the database. Nothing else.

    nk (6c75e0)

  27. P,
    You are right.

    V

    Vmaximus (4c3f33)

  28. Joe, Joe, Joe.

    Did you read the instructions?

    We’re not talking about whether the assertion is accurate. We’re talking about one issue, and one issue only: did the LAT passage mean what I claim it must have meant?

    If you want to say I’m wrong, that’s fine — all you have to do is provide an alternate explanation of the meaning of what they said.

    Instead, you seem to be challenging the math. I VERY CAREFULLY EXPLAINED that I was not interested in discussing the correctness of the math.

    So, you get an “F” for reading comprehension. But you’re welcome to try again.

    Patterico (cb443b)

  29. nk,

    We’re not talking about what the hit proves.

    So please, don’t talk about that.

    Just tell me whether my interpretation of their English is correct or not.

    You’re smart enough to do that.

    Patterico (cb443b)

  30. You’re wrong:

    The LA Times could mean that the DNA database walked up to Puckett in a bar and made a pass at him.

    Leviticus (22eefa)

  31. Heh.

    Patterico (cb443b)

  32. You’re right. If I am on a jury and that is the language in the judge’s charge/instruction, I would heavily discount the validity of the DNA evidence.

    I would also insist on further clarification.

    Ed (532eeb)

  33. You are right.

    I know we’re not supposed to talk about whether or not the 1 in 3 number is right or wrong, but it could be right. Puckett may be innocent of something. Not the crime they are investigating at the moment, but I’d say there is a 1 in 3 chance he is innocent of something or other. Or perhaps Puckett is one of a genetically engineered set of triplets of which two were involved in the crime. So many plausible explanations..

    CT LEO (e68882)

  34. I’m too smart. I’m filling in the blanks for them whether I want to or not.

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent [or guilty] person.

    In Puckett’s case, it was 1 in 3.”

    I don’t know if you’re right or wrong.
    They’re morons. What argument can you give at their level other than my previous comment? How does one interpret “Gaga googa, gaga didda”? Which is what they essentially said.

    nk (6c75e0)

  35. no more statistics posts!

    assistant devil's advocate (c34fd2)

  36. You’re right. There is only one sane way to interpret their sentence, and you have it.

    John Cunningham (1cb7c8)

  37. ADA, this isn’t a statistics post. It’s an English-language post.

    Since you misread Patterico’s post, you fail the English language.

    Daryl Herbert (4ecd4c)

  38. You’re clearly right. As has been said, the “it” in the second sentence can only refer to the probability that the database search had hit on an innocent person. This is simple English here. It’s obvious why the Times isn’t giving the alternate interpretation. There isn’t one.

    Alan (0cf397)

  39. You are right. So now you have international (Canadian, in my case) agreement, too.

    ras (fc54bb)

  40. Your Right

    Dr T (340565)

  41. as opposed to your left
    PIMF

    Dr T (340565)

  42. You’re right.

    At least I can’t come up with another way to read the sentence, and I’ve had way too much experience trying to understand what students were trying to say when describing probabilities.

    Mark (005214)

  43. You are right, and I fail to see how it could be interpreted any other way. It has been years since I’ve tried to diagram sentences and I haven’t got a clue on how to do it with a keyboard, but when you break the first sentence down, the subject is statistic. It is followed by clauses telling us it is considered an important statistic, and that leading scientists are the ones who consider it important. The predicate is probability, followed by a couple more clauses that tell us what the word probability is about. Sorry, but it has been too many years for me to remember what the correct terms are for each of the clauses since I haven’t diagrammed sentences since I was a freshman in high school in 1957. Anyhow, probability is modified by clauses to tell us it is about a database search and the chances of hitting an innocent person with that search. The fact that another sentence immediately follows that tells us that in Puckett’s case it is one in three can only be applied to the previous sentence and so they clearly said the chance of hitting an innocent person in that database search was one in three when using Puckett’s DNA. They can argue what they meant all they want, but unless they were willing to diagram it another way and justify said diagramming to prove it said something else, they are talking through their hats.

    Having said all that, why don’t you send it to James Kilpatrick and ask him. He’s an old time newspaper guy who loves to write about and parse words and his email address and columns can be found at Yahoo by going to opinion and checking out columnists. To save you the trouble it is kilpatjj@aol.com, but I included the other information for those who might like to read his columns. He has very graciously responded to my emails on the three occasions I have written him, the last being less than two weeks ago. As I recall he has worked in the newspaper business something on the order of 70 years as a writer and editor. You may even remember him because for years he covered the Supreme Court cases which he recently stopped doing.

    Fritz (96f64d)

  44. You’re wrong.

    The reason I say that you are wrong is because the two sentences are non sequiturs, in my opinion, and do a horrible job of explaining the situation. After reading it several times, I am not sure what they are trying to say. Because the writing is so poor, you could probably come up with a dozen explanations for what it said.

    Here are some statements about the case that I think are true:

    The odds of the criminal’s DNA taken from the scene being matched to me through this database are zero (I am not in the database, so there is no chance of a match).

    The odds of the criminal’s DNA from the scene being a match to a record in the DNA database are 1 in 3.

    The odds of my DNA being a match to a record in the DNA database are 1 in 3.

    The odds of your DNA being a match to a record in the DNA database are 1 in 3.

    Once you name an individual in the database, the odds go back to very large:

    If there is a hit, the odds are one in 300,000+ that it is Puckett.

    The odds of the suspect’s DNA being a match to Puckett’s are 1 in 1.1 million.

    Personalizing it allows investigators to bring other tools beyond a statistical hit into the mix. For example, if Puckett were in jail at the time of the crime, or giving a speech, or physical characteristics ruled him out, you can cross him off the list of suspects.

    One way to think about statistics is exclusion, not inclusion. You eliminated everyone in that database except Puckett as possible suspects. That does not make Puckett the only suspect, but it makes him the only suspect out of the population in that database.

    MartyH (fd100c)

  45. MartyH

    The reason I say that you are wrong is because the two sentences are non sequiturs, in my opinion, and do a horrible job of explaining the situation. After reading it several times, I am not sure what they are trying to say. Because the writing is so poor, you could probably come up with a dozen explanations for what it said.

    I asked people to come up with one. You came up with zero.

    For purposes of this post, I don’t care about your opinions about the relevant statistics. There are plenty of other posts where you can discuss those matters. This is not that post.

    Your claim that I am wrong carries absolutely no weight until and unless you can give me ONE plausible explanation of the language that is a) inconsistent with what I claim it says yet b) consistent with the actual language of the article.

    Come on. If there are a dozen plausible explanations, give me ONE.

    Patterico (cb443b)

  46. you’re right and the Times has some piss poor writers….. if they meant anything else, they should have constructed the sentence differently.

    as we say on Usenet:
    “Some people have a great command of the English language. A certain percentage of them, however, tend to lead it into suicidal charges.”

    redc1c4 (e3877d)

  47. You are wrong as I read it without knowing any other details. I read it as following these logical steps:

    They put the dna into the database.

    The database returned one match.

    There is a 1 out of 3 chance that that return isn’t guilty of the crime for which it corresponded. [Patterico sez: j curtis here says I’m right. He just didn’t read the post carefully enough. — P]

    There is no reason for me to believe otherwise without any other information to go on. I can only take the assertion of the “leading scientists” at face value.

    j curtis (c84b9e)

  48. Why can’t people read the post?

    j curtis, you just said the same thing Patterico does about that sentence, which means you agree with how he reads it. So, does the sentence as written mean that there is a 1 in 3 chance that Puckett was innocent? That’s what Patterico says it means and the LAT says it doesn’t, but they won’t explain why or what else it could mean. If you think Patterico is wrong, explain how else that sentence could be read that is inconsistent with what he claims it says, yet consistent with the actual language of the article.

    Stashiu3 (460dc1)

  49. There is a 1 out of 3 chance that that return isn’t guilty of the crime for which it corresponded.

    Put another way, there is a 1 in 3 chance Puckett is innocent.

    So, you think I’m right. You just apparently didn’t pay attention to the actual question I asked.

    Patterico (cb443b)

  50. Patterico-

    I don’t know what the hell the Times is trying to say in those two sentences. So I really can’t say that you are right.

    [Patterico adds: Please see comment 97, where this commenter acknowledges that I am right.]

    I’ll reword the paragraph in a way that makes sense to me. You decide if you can get to my explanation from their words.

    LA Times:

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.”

    MartyH:

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. For example, dear reader, if we took your DNA and compared it to the DNA database, there is a one in three chance that there would be a match to someone in the database.”

    Or how about:

    “The statistic that leading scientists consider the most significant is the probability that the database search will result in a random match. For example, if we exhumed a random body buried a hundred years ago and compared its DNA to the living subjects in the database, there is a one in three chance that there would be a match to someone in the database.”

    Or how about the negative condition that I mentioned in my post:

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. A random DNA sample submitted will not match anyone in the database 2/3 of the time. If there is a match, further work must be done to determine if this is the one time in three that there is a random match, or if the person so identified is guilty.”

    I think the LAT wording has some huge holes. I tried to fill in those holes with a coherent explanation that is not yours.

    MartyH (fd100c)

  51. Marty H, I am going to quote you here and just put bold tags around a couple of words:

    LA Times:

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.”

    MartyH:

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. For example, dear reader, if we took your DNA and compared it to the DNA database, there is a one in three chance that there would be a match to someone in the database.”

    See, what you’re doing is trying to take the LAT wording, which makes no sense, and turn it into SOMETHING DIFFERENT that does make sense.

    But I’m not asking for something that makes sense. I’m just asking for what they said. Do you understand?

    You have not offered an interpretation of what they actually said, because it uses verb tenses that aren’t consistent with the language they used. You’re just trying to reword their contentless crap sentence into something that you can recognize as bearing some resemblance to accuracy.

    Forget the accuracy of their assertion. Yes, it makes no sense. Fine. What did they actually SAY?

    Patterico (cb443b)

  52. A match doesn’t confirm guilt and there wasn’t anything random. But that’s not what the post is about.

    Anyway, here is how I would re-word it: “The Times asserts that according to the statistic that leading scientists consider the most significant, in Puckett’s case there is a 1 in 3 chance that the DNA match (Puckett’s) was to an innocent person.”

    If there is any other way to parse that and remain consistent to the original wording, yet come out with a different meaning, I can’t think of it.

    Stashiu3 (460dc1)

  53. You’re right. But I think you’re confusing the grammatical meaning of the sentence with its logical consequences. Grammatically, I think one could argue that there is a distinction between the database search and Puckett. The 1 in 3 statistic refers to the database search, not to Puckett’s guilt or innocence.

    Logically, we then ask: is it possible for the database search to have a 1 in 3 chance of hitting on an innocent person, without thereby implying that there is a 1 in 3 chance that Puckett is innocent? I don’t see how. Unless they can come up with a way to avoid this conclusion logically, they were saying, in essence, that there is a 1 in 3 chance that Puckett is innocent.

    So while they could say that the 1 in 3 statistic applies to the database search and not to Puckett, logically it ends up the same. Therefore, again, you’re right.

    Tragic Clown Dog (171743)

  54. Grammatically, I think one could argue that there is a distinction between the database search and Puckett. The 1 in 3 statistic refers to the database search, not to Puckett’s guilt or innocence.

    Huh? The database search hit on Puckett. Grammatically, I don’t see how you could separate the two, because the article refers to the chance that it “had hit” on an innocent person.

    I know you say, ultimately, that I am right; I just don’t understand the quoted passage. Grammatically, it’s a reference to what happened: the search “had hit” on Puckett.

    Patterico (cb443b)

  55. Not one person has offered a plausible alternate reading. So far, the only people to say I’m wrong appear to be people who were a) being sarcastic, or b) didn’t understand what I was asking and can’t supply a plausible alternate reading.

    This is what I thought would happen. Is anyone at the LAT reading this? Is anyone embarrassed?

    Patterico (cb443b)

  56. Grammatically, I don’t see how you could separate the two, because the article refers to the chance that it “had hit” on an innocent person.

    That’s a logical point, not a grammatical point. Logically affirming one entails the affirmation of the other. Grammatically, I think they could argue that the 1 in 3 statistic is referring to the clause “database search” unless they explicitly identify this with Puckett’s guilt or innocence (by saying something along the lines of “this means that there is a 1 in 3 chance that Puckett is innocent”).

    Now I really doubt they thought it through, and were making such a distinction when they wrote it. Obviously, they were saying there is a 1 in 3 chance that Puckett is innocent. I’m just suggesting that one could (not that one must) make such a distinction on a grammatical level — always ignoring that logically it ends up meaning the same thing.

    Tragic Clown Dog (171743)

  57. You’re right, I can’t think of any other way to interpret it.

    Ken from Camarillo (aa2192)

  58. I wouldn’t be willing to baldly say you’re wrong, but I thiiiiiink you might be. Maybe.

    I think you are making a logical leap that the LAT doesn’t make — in fact, *carefully* does not make. I *think* the “fallacy” the reporter referred to (in the earlier exchange with you) was the extrapolation that because the odds were 1 in 3 that the hit would be an innocent person, therefore the odds were 1 in 3 that Puckett was innocent.

    I do not in any way feel on terra firma here, though.

    Not Rhetorical (5f6368)

  59. I can’t supply an alternative wording because I don’t think there *is* an alternative wording. It is what it is — I think.

    Not Rhetorical (5f6368)

  60. because the odds were 1 in 3 that the hit would be an innocent person, therefore the odds were 1 in 3 that Puckett was innocent.

    “would be” is not what the article said.

    “had hit” is what it said.

    Words mean things. The article discussed the chances that the search “had hit” on an innocent person.

    Patterico (cb443b)

  61. So you have to call it an “extrapolation” that because the odds were 1 in 3 that the hit had hit an innocent person, therefore the odds were 1 in 3 that Puckett was innocent.

    That’s not an extrapolation; it’s a tautology. Because of the verb tense.

    Patterico (cb443b)

  62. Ohhh, I see. I kept wondering why “had” was bolded — I see now that the distinction has to do with verb tense.

    I have to say: I think it’s possible that many — maybe most — of the people who agree with you in this thread would still “agree” with you even if the sentence had used “would” instead of “had.” Even though you seem to think the sentence would be defensible if it said “would.”

    Not Rhetorical (5f6368)

  63. I posted while you were posting. Sorry for the repetition.

    Not Rhetorical (5f6368)

  64. I don’t want to use the term “defensible” in this post because I’m not talking statistics. I’m talking meaning.

    If it had said “would” then one could argue that there was a 1 in 3 chance that a search WOULD hit on an innocent person — but that doesn’t necessarily say anything about what DID happen.

    The second you say there was a 1 in 3 chance that the search HAD HIT on an innocent person, you are inevitably talking about what DID happen — i.e. about the hit to Puckett.

    This is what Mr. Felch isn’t getting. His article said “had hit.” He admits that he can’t properly say there’s a 1 in 3 chance Puckett is innocent. But he doesn’t seem to understand that that is EXACTLY what he DID say by using the “had hit” phraseology.

    Patterico (cb443b)

  65. Now: say “You’re right.” 🙂

    Patterico (cb443b)

  66. You’re right. Now post a new survey, substituting “would” for “had” in a fake LAT quote, and see if you can still get 50 comments agreeing that the fake quote says there’s a 1 in 3 chance Puckett is innocent. I’ll bet you can!

    Not Rhetorical (5f6368)

  67. I shoulda put a smiley face on that. :o)

    Not Rhetorical (5f6368)

  68. You’re absolutely correct.

    Here’s why I believe so:

    I will break the sentence down into “motivational phrases,” or the way arguments are intended to persuade. (Having taken argumentation and persuasion as individual classes, I can say there IS a “science/art” to phrasing, pacing, word choice, and sentence structure that has at its base the intent to move the listener/reader to come to the author’s conclusion.)

    The statistic that leading scientists consider the most significant

    Here we open with “respected authority”. Inserting that “respected authority” between the words “statistic” and “most significant” achieves two things.

    It spotlights “leading scientists” and weakens the fact that it is a “statistic”. It also delays the “finding” by these “leading SCIENTISTS”…UNTIL AFTER THEIR INTRODUCTION.

    Even though the logic that is about to be introduced is about the “statistic” itself, it can’t be introduced until it has backup authority.

    is the probability that the database search had hit upon an innocent person.

    So, based upon the way the “setup” to the argument has been made…we NOW are told that “leading scientists” think that the “probability that the database search had HIT ON AN INNOCENT PERSON”…is the most significant.

    1) The first thought fragment that the listener/reader will capture…is “LEADING SCIENTISTS”; the brain will then attach [single most important statistic] to the gravity of their “scientific” position.

    You are no longer arguing with the author, you would be taking on “leading scientists” [and, inherent in the message… you won’t have the ammo to do so]

    In Puckett’s case, it was 1 in 3.

    Now we introduce the actual conclusion they wish you to reach, based upon the unassailable “expert authority” which has been spotlighted. Puckett was framed.

    the brain will make the following adaptations:

    “Leading SCIENTISTS all come to the same conclusion: Puckett was convicted on evidence that was so weak…he was likely to be one out of 3 people who were innocent and wrongfully prosecuted. He was the unlucky ONE (out of three) in this instance.”

    Anyone who plays golf knows that the brain filters some words out. “Don’t hit it in that water on the left”…the “don’t” is filtered out.

    The leftist media has this game down pat. They do it on TV and in the deadwood media all day, every day. When you go back and isolate their language…they have “plausible deniability” based upon the ambiguous nature of the language quite often.

    But the intent is clear. You can’t make this many “mistakes”…all with the same narrative over and over and continue to be believed that it is an accident or coincidence…pure innocence.

    Not when there is an entire body of study on the subject which outlines how to do EXACTLY what they are doing each and every time.

    The INTENTION here…is to lead the listener/reader to the unassailable conclusion that Puckett not only was the poor sucker who HAPPENED to be the innocent one out of 3 statistic…but that it was inevitable that 33% of ALL accused parties are one of three “innocents” who get prosecuted.

    Not only is there no other way to reasonably read the sentence…the classroom teachings in this field show this to be a classic example of how to build the sentence structure and present the reader with no other conclusion.

    Not only are you right, Pat…but as a bonus…if we pulled up the syllabus and examples from the classes …leading experts in the field would find that the chances that this was intentional are more than 99/100.

    cfbleachers (4040c7)

  69. You’re right.

    tyree (c2467a)

  70. You’re still wrong,

    maybe this is more clear
    “The statistic that leading scientists consider the most significant is the probability that the database search coins had turned up that way on chance alone had hit upon an innocent person. In Puckett’s case who cares, it was 1 in 3 whatever.

    that’s a statement of fact about the process. it doesn’t say if it’s random or not.

    also

    1. the odds that the match is in error are not the odds that the guy is innocent or guilty. if we know the guilty person left the DNA it can prove that he didn’t do it by showing there is no match. But unless the odds of a mismatch are near 0 it doesn’t show that they’re guilty. I don’t know anything about this case so that might not matter.

    [Joe, I can’t really understand what you’re saying here, but it seems clear to me that you still don’t understand the question I’m asking. It doesn’t have to do with the correctness of the statistics. It has to do with what the L.A. Times is actually saying. — P]

    Joe (c0e4f8)

  71. shit, none of my tags worked.

    “The statistic that leading scientists consider the most significant is the probability that the database search coins had turned up that way on chance alone had hit upon an innocent person. In Puckett’s case who cares, it was 1 in 3 whatever.

    Joe (c0e4f8)

  72. strike still isn’t working. one more try

    “The statistic that leading scientists consider the most significant is the probability that the coins had turned up that way on chance alone. In who cares, it was whatever.

    strike worked in preview.

    Joe (c0e4f8)

  73. You’re right, mostly. It doesn’t explicitly say that but that is the extremely strong inference and I can’t imagine what else it might be trying to say or how else it might be reasonably interpreted. So, the degree to which you could be wrong is the same degree to which the author is an incompetent writer, unable to express himself with any clarity. If you’re wrong, he should be writing classified ads or grooming dogs or something.

    Joe, you’re still arguing statistics and/or the accuracy of the statement. That is not the question. What does the excerpt say? Your reframing of it makes Patterico’s point.

    Pablo (99243e)

  74. You are right.

    Paula R. Robinson. M.D. (f67c2f)

  75. You are right. What is there to discuss? The wording in the sentence referenced, is unambiguous.

    Edward Lunny (85f233)

  76. You are right. The way the sentence reads, there is a 33% chance that this match in the database is of an innocent person. It implies that other matches may be the same or possibly even worse.

    rudytbone (93b68e)

  77. You’re right. I agree now.

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. [i]n Puckett’s case, it was 1 in 3.

    There’s no other reasonable way those two sentences could be parsed.

    nk (6c75e0)

  78. “You’re right.” Whether sloppiness or what at the LAT, the article was intended to convey that the police state under Bush is railroading innocents on the basis of faulty evidence.

    EW1(SG) (84e813)

  79. There’s no other reasonable way those two sentences could be parsed.

    They could be incoherent. 🙂

    Pablo (99243e)

  80. Well, actually, that’s the same as saying that the chance that Puckett is guilty is 2 out of 3 or, even, that 2 out of 3 times you will find the guilty person in the database. That’s what made my head spin when I first read it.

    nk (6c75e0)

  81. You’re Right.

    Charles Harkins (971090)

  82. Of Course you are right. Talk about an agenda. A new low in misleading the public about the criminal justice system.

    Infidel (407246)

  83. You’re right; there’s simply no other way to read what they’ve written.

    Tim K (2bb72a)

  84. Joe,

    Use “strike” and “/strike” instead of the “s” and “/s” that the tag puts in and you will get a strikethrough. Then, read the post and the question before answering again. It’s not about whether the 1 in 3 is correct, but what the sentence means. There is really no other way to read it.

    Stashiu3 (460dc1)

  85. I’ve read it, to me they’re saying that when you do a DNA data base search with the criteria in the puckett case there is a 33% chance of getting a false positive.

    Joe (c0e4f8)

  86. You’re right — that’s certainly how I read it.

    I tried for a more obscure/pedantic reading, where “in Puckett’s case” refers not to Puckett specifically but more generally to “a criminal investigation such as that involving Puckett”, but I ended up with the same semantic conclusion: that there’s a 1 in 3 chance that Puckett is innocent.

    As you said, I don’t know whether that’s true or not, but it is how I read the passage. ..bruce..

    bfwebster (2f56b4)

  87. I’ve read it, to me they’re saying that when you do a DNA data base search with the criteria in the puckett case there is a 33% chance of getting a false positive.

    Which is saying that there is a 33% chance that Puckett’s positive was a false one, meaning he’s innocent… correct? That’s how we’re reading it and what Patterico is saying. The LA times says that interpretation is wrong.

    Stashiu3 (460dc1)

  88. My interpretation is the same as yours.

    jim2 (f70403)

  89. Judging from the number of people who have argued statistics or logic in this thread, I can understand why the LA Times thinks it can get away with claiming the sentence doesn’t mean what it clearly means. There are obviously more people with reading comprehension problems than I realized.

    Fritz (4c04c1)

  90. by EW1(SG):

    “You’re right.” Whether sloppiness or what at the LAT, the article was intended to convey that the police state under Bush is railroading innocents on the basis of faulty evidence.

    Definitely sloppiness. A better analysis on their part would have led to an article that was more damning of the prosecution, not less. The existing article leads the reader to believe it is twice as likely that the prosecutors framed a guilty man. In fact, we don’t know that. Depending on the odds that the killer was in the database, the real probability of Puckett’s guilt could either be much higher or much lower than 1 in 3. All it really is, is a greater unknown, with a jury convicting someone based on odds that the prosecution portrayed as 1 in 1.1 million and the defense portrayed as 1 in 3, but they were really 1 in “splunge.”

    Xrlq (62cad4)
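The dependence on "the odds that the killer was in the database" can be made concrete with a small Bayes calculation. This is only a sketch under simplifying assumptions that appear nowhere in the article or this thread: the true source always matches if his profile is in the database, every other profile matches independently with the random match probability, and exactly one hit comes back. The function name, the prior values tried, and the database size of 338,000 (picked so that size times match probability lands near 1 in 3) are illustrative assumptions.

```python
def posterior_hit_is_source(prior_in_db: float, db_size: int, match_prob: float) -> float:
    """P(the single hit is the true source | exactly one hit was returned).

    prior_in_db: assumed prior probability that the true source is in the database.
    db_size:     number of profiles searched (hypothetical value in the demo below).
    match_prob:  random match probability for one unrelated person.
    """
    if not 0.0 <= prior_in_db <= 1.0:
        raise ValueError("prior_in_db must be between 0 and 1")
    if db_size <= 0 or not 0.0 < match_prob < 1.0:
        raise ValueError("db_size must be positive and match_prob strictly between 0 and 1")
    # Source in database: he matches, and the other N-1 profiles must not match.
    # Source absent: exactly one of N innocent profiles must match.
    # The common factor (1 - p)^(N - 1) cancels, leaving N * p in the denominator.
    expected_innocent_hits = db_size * match_prob
    return prior_in_db / (prior_in_db + (1.0 - prior_in_db) * expected_innocent_hits)


if __name__ == "__main__":
    p = 1 / 1_100_000   # the 1 in 1.1 million figure quoted above
    n = 338_000         # hypothetical database size, an assumption for illustration
    for prior in (0.01, 0.5, 0.99):
        print(prior, round(posterior_hit_is_source(prior, n, p), 4))
```

Under these assumptions the answer swings from a few percent to nearly certain as the prior moves, which is the sense in which neither 1 in 3 nor 1 in 1.1 million is, by itself, the probability of guilt.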

  91. I wasn’t trying to argue the statistics, i was trying to use an example. did bad job of it i see.

    look, There’s a rate of false positive associated with the test. In this case that rate is 33%.

    It doesn’t mean that you are or are not innocent.

    It means that the chances you scored positively on the test are 33%.

    maybe it’s less clear when you know more about the case or something.

    [Joe, you are arguing statistics. You have a mental block, my friend, that makes you quite incapable of understanding the question I asked in the post. You just can’t bring yourself to examine whether the paper’s actual statement is equivalent to my characterization of the statement; you feel compelled to jump to the question of whether the statement is accurate.

    Let’s try an analogy. The paper says: “Venezuela is one of several countries in North America. Countries in North America are all north of the equator.” If I ask: “Is the paper telling us that Venezuela is north of the equator?” you would insist on answering that “Venezuela is south of the equator” when the correct answer to MY QUESTION is “Yes.” Because I didn’t ask where Venezuela actually was. I asked where the paper SAID it was. Get it now? — P

    I have a feeling you will answer this analogy by saying: “But Venezuela IS south of the equator!” At that point I will cease trying to explain this to you. — P]

    Joe (c0e4f8)

  92. You are right. Hammer on…

    EHeavenlyGads (f29174)

  93. Which is saying that there is a 33% chance that Puckett’s positive was a false one, meaning he’s innocent… correct?

    No, that’s not an accurate statement. If the false positive rate were 99.9% and he did it than he’s guilty.
    If the false positive rate were 1*10^-10% and he didn’t do it than he’s innocent.

    Joe (c0e4f8)

  94. You are right.

    Jim (dde045)

  95. The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    No matter what the facts in the case are, the sentence is talking about hits on an innocent person. I know that a hit doesn’t prove innocence or guilt, I said as much in #52. We’re talking about how the sentence reads, and the following sentence that specifically relates it to a specific individual (Puckett). Whether they are right or wrong to make the 1 in 3 claim, they are saying that there is a 1 in 3 chance that Puckett was identified wrongly by the database search. Inherent in the way they framed it is that a wrong identification means that person is innocent. Yes, they’re wrong to say that a positive hit, or false-positive, or lack of hit means a person is innocent or guilty… that’s not relevant. What they are (rightly or wrongly) saying is the question. Patterico is right about what it means.

    Stashiu3 (460dc1)

  96. You’re right.

    But then again, English is my first language.

    huey (9558ff)

  97. I’ve slept on it a little bit.

    The problem revolves around the word “it.”

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    What is “it”?

    Cut and paste:

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, the probability that the database search had hit upon an innocent person was 1 in 3.

    Change “hit upon” to “matched”

    The statistic that leading scientists consider the most significant is the probability that the database search had matched an innocent person. In Puckett’s case, the probability that the database search had matched an innocent person was 1 in 3.

    The logical conclusions of the third and final paragraph are that 1) Puckett’s DNA matched that of the suspect 2) Puckett ultimately was innocent.

    A simpler version of the problem is: “Frank and Jim were talking. He has cancer!” Now who has cancer: Frank or Jim? It’s ambiguous.

    MartyH (fd100c)

  98. No, that’s not an accurate statement. If the false positive rate were 99.9% and he did it than then he’s guilty.
    If the false positive rate were 1*10^-10% and he didn’t do it than then he’s innocent.

    Fixed it (using strikethrough). As I said in #52, I know that it doesn’t prove innocence or guilt, I’m using the terms they use in their sentence which is what the post is about.

    Stashiu3 (460dc1)

  99. You are right…there is only one rational way to read that statement.

    mjn1957 (6e1275)

  100. You are correct.

    Sue (f9a0a4)

  101. You are wrong. The terms “database search”, “hit”, “innocent person”, and “it” are so vague as to make the 1 in 3 comment worthless. The awkward wording makes it clear the article was deliberately written to shock by insinuating things not true, though it is technically accurate.

    [Wesson, the question is not whether the statement is accurate, or worthless, or any of that. The question is whether the paper’s statement is equivalent to what I said.]

    Wesson (785f2a)

  102. You’re right

    I am no statistician but I’ve had a couple college level classes in it.

    BK

    agesilaus (9923ce)

  103. Wesson,

    The awkward wording makes it clear the article was deliberately written to shock by insinuating things not true, though it is technically accurate.

    If it is technically accurate, what does it say? And what does it insinuate?

    Pablo (99243e)

  104. You are right.
    This whole affair brings to light the dangers of lawyers who only vaguely understand statistical math to sway jurors with an even foggier grasp towards a desired conclusion. At what point are citizens struck from a jury for demonstrating the ability to understand statistics?

    Amused Observer (0a663a)

  105. You are correct.

    This paragraph says there is a 33.3333% probability that Puckett is innocent.

    Lawyers, English professors, and journalists can parse and deconstruct to their heart’s content, but it is plain everyday American English and it means what it says.

    Iowan (c8e0d7)

  106. You are right.

    A partial match, whether DNA or license plate, may match multiple candidates. But the fraction of the random or coincidental matches against the pool is independent from whether or not the target is in the pool. And multiple matches can be obtained, so we would expect 2 coincidental matches about 1/9th of the time, and 3 about 1/27 of the time and so on.

    I think this has to be taken as circumstantial evidence, but then that is one of the best kinds of evidence.

    gene smith (9ef2f9)

  107. It’s crystal clear. The paper stated that there is a 1 in 3 chance Puckett was innocent. That’s what they said.

    But the paper also does not put much stock in words, or logic, or meaning. There is no right and wrong. Anything they say is by definition right, for them.

    There’s no sense arguing with one whose hands are over his ears.

    Amphipolis (e6b868)

  108. In a related story, the LAT said that respected scientists have determined that 1 in 3 readers of legal blogs not only have questionable math skills, they have additional problems with reading comprehension.

    *sigh*

    Stashiu3 (460dc1)

  109. #108 was not directed at Amphipolis (or anyone else who actually read and understood the post).

    Stashiu3 (460dc1)

  110. You’re right, I can only see one way to read what they wrote.

    BUT – I think they’ve made the beginner’s mistake in describing the usual statistic for this sort of thing. The correct way to describe that statistic is, “Assuming the DNA came from two different people, the probability of at least this strong a match happening by chance is 1/3.” This statistic says nothing about the chances of a real match occurring (i.e. guilt or innocence); it *assumes* they were different (innocent), and asks for the probability of getting a match this strong by chance. Is it possible to read what they wrote this way? Not by me, but I’ll bet the Times thinks it is.

    Wonka (0b049c)

  111. You’re right. Standing alone, the sentence misstates interpretation of DNA database “matches.”

    The preceding paragraphs mention protocol variables:

    Badly deteriorated DNA from the assailant’s sperm was linked to John Puckett, an obese, wheelchair-bound 70-year-old with a history of rape.

    The DNA “match” was based on fewer than half of the genetic markers typically used to connect someone to a crime, and there was no other physical evidence.

    Where is the article’s authority for supposing “the statistic that leading scientists consider the most significant” was the one profiled?

    steve (187be6)

  112. You’re right. Now post a new survey, substituting “would” for “had” in a fake LAT quote, and see if you can still get 50 comments agreeing that the fake quote says there’s a 1 in 3 chance Puckett is innocent. I’ll bet you can!

    I might get some commenters agreeing, but I have a pretty sharp bunch here, and I feel confident that a good number would see the distinction between describing the probability of expected results and describing the meaning of results that have already occurred.

    Check comment #2, for example.

    I will note also that the e-mails I received from Jason Felch justifying a failure to correct asserted 1) that my commenters disagreed with me on the basic issue of whether the 1 in 3 number in the article was accurate, and 2) specifically cited Xrlq and Daryl Herbert as supporting the LAT position.

    So it’s nice to see that a) the reporter admits that “formulation x” is statistically inaccurate, and b) there is unanimity among commenters who understand the question, INCLUDING Xrlq and Daryl Herbert (!!), agreeing that “formulation x” is what the paper actually said.

    If the paper is paying attention to this, and really does value accuracy over pride and a desire to wave aside annoying gadflies, they will correct this.

    Patterico (cb443b)

  113. You’re Right. As I have understood English for the last 70 years.

    Buckshot (62b062)

  114. But Patterico, notice how few of your commenters are even zeroing in on “would” versus “had” or past versus future.

    Now, in terms of whether or not you’re right, that doesn’t matter, of course. They could all disagree with you and you could still be right. But you’re making a case (I think) that because the overwhelming number of commenters (supposedly) see it the same way you do, the sentence is therefore patently wrong as understood by the lay reader.

    What *I’m* saying, though, in my — you guessed it — not-entirely-rhetorical suggestion for a new survey is that you could fix the sentence and still get near 100% agreement that the sentence says something it *doesn’t* say.

    Again, I do realize that this doesn’t undercut your basic premise: that you’re right. But I do think it seriously damages this particular thread as support of your point.

    Not Rhetorical (5f6368)

  115. You’re right. Standing alone, the sentence misrepresents interpretation of DNA database “matches.”

    The preceding paragraphs mention protocol variables:

    Badly deteriorated DNA from the assailant’s sperm was linked to John Puckett, an obese, wheelchair-bound 70-year-old with a history of rape.

    The DNA “match” was based on fewer than half of the genetic markers typically used to connect someone to a crime, and there was no other physical evidence.

    Where is the article’s authority for supposing that “the statistic that leading scientists consider the most significant” was the one profiled?

    steve (187be6)

  116. Let me go back to parsing the original sentence, though. And let me admit that *I* don’t really understand how it can be that there’s a “distinction between describing the probability of expected results and describing the meaning of results that have already occurred.” I think it’s the same thing as Schroedinger’s cat, and that just ties my brain in knots.

    “The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.”

    I think that to fix the above sentence, you’d have to substitute “would hit upon” for “had hit upon” AND substitute “In Puckett’s case, it would have been 1 in 3” for “it was 1 in 3.”

    But here’s where I think there’s a true statistical (not just grammatical) question that was possibly eluding the reporter and certainly eludes me: Considering that the case *has* already occurred, why *can’t* it be expressed simply in the past tense instead of in whatever tense you’d call it when you’re in the past looking at the future? (I did a little research and I think we’re talking about the conditional past perfect tense, but I’m no grammarian.)

    Not Rhetorical (5f6368)

  117. You are right. You can also read it to mean that there is a 2 in 3 chance that he is the right match.

    BT (78b929)

  118. you are right

    whole buncha “if’s” are needed to get to 1 in 3

    SteveG (71dc6f)

  119. Combining the points I try to make in #113 and #115, here’s what I’m driving at:

    I believe the error the reporter made was in using the past tense rather than the (conditional past perfect?) tense.

    I believe that even if the sentence were fixed, most readers would still conclude that the LAT was saying there was a 1-in-3 chance Puckett was innocent — a conclusion that I think you, the reporter and I agree would be (and is) wrong.

    I believe that *because* most readers would still think that, and *because* most of the words devoted to disproving the statement have not zeroed in specifically and explicitly on the tense problem, the reporter still doesn’t *realize* the mistake he has made and still thinks you are among the legions that, I posit, will still and always make the wrong logical leap.

    I don’t mean to say “legions” arrogantly, by the way; I include myself among them. Or at least, I include myself among those who would be *inclined* to make the leap and really can’t quite understand why I *can’t* make the leap, even though I take it on faith (for lack of a better word) that I’m not allowed to make that leap.

    So: As I’m sure you won’t be surprised to learn, I don’t think you’re right to pillory the LAT or the reporter on this. Or at least not to be so angry about it! :o) I don’t think they’re being malicious. I agree that it’s an important point you’re making, but I don’t think you’ve yet hit on the best, clearest, most concise way to make it, and I think it therefore continues to elude the reporter. And the more times your argument is made imprecisely (and at length), the less they’re likely to be listening. That’s not really defensible on their part, but it is human nature.

    And all *that* goes back to my fundamental belief, stated to you before: I don’t think anyone is giving anyone the benefit of the doubt, and that sucks. It pains me, in both directions. I suspect (but don’t know!) that the LAT — though it should be above this — is not being as patient as it should be, because it knows you’re an ever-persistent gadfly who sort of hates it, and stirs up people to hate it more. And I think you’re assuming this is yet more evidence of a supposed agenda on the LAT’s part, rather than an honest misunderstanding.

    Whew. Now my head is killing me. Thanks a lot! I prefer not to think so hard on *weekends*, Patterico, you jerk. :o)

    Not Rhetorical (5f6368)

  120. You’re right, of course.

    Bob (f133bd)

  121. I agree that it’s an important point you’re making, but I don’t think you’ve yet hit on the best, clearest, most concise way to make it, and I think it therefore continues to elude the reporter.

    Well, at the very least, they could have said “would” instead of “had.”

    That’s pretty concise.

    Now, you’re correct that doing only that could still leave a misleading impression among those who considered the matter superficially.

    So if we want to move beyond bare accuracy to actually informing readers — which is a large part of what I have been interested in doing all along — then yes, it takes more words to explain the concept.

    The best way to express this would have been this:

    The statistic that leading scientists consider the most significant is the probability that a search of a database of innocent individuals would hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    That adds a critical assumption: that the database is of innocent people.

    If you don’t include that assumption, then you can still talk about the chances that a database search WOULD hit on an innocent person — but you’re leaving out the chances that it would hit on a guilty person.

    That would be potentially misleading in a case where you have one and only one hit.

    But it would at least be accurate.

    The paper should, at a minimum, correct the “had” to “would.” The fact that superficial readers might still be confused is not a reason to leave an inaccurate assertion standing.

    Why am I so mad? Because the paper splashed a major statistical fallacy across the front pages of its Sunday edition, and won’t correct it. That’s why I’m angry.

    LAT: Issue the correction. Change “had” to “would.” Be accurate.

    Concise enough?

    Patterico (cb443b)
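The "would hit" reading in the comment above can be written as a small calculation: if every profile in the database belongs to an innocent, unrelated person, how likely is at least one coincidental hit? This is a rough sketch, not the committees' or the paper's actual method. It assumes independent matches, and the database size of 338,000 is a hypothetical value that does not come from this thread; only the 1 in 1.1 million figure does. Function names are made up for illustration.

```python
import math


def prob_at_least_one_innocent_hit(db_size: int, match_prob: float) -> float:
    """P(at least one hit) when all db_size profiles are innocent and unrelated."""
    if db_size < 0 or not 0.0 <= match_prob < 1.0:
        raise ValueError("db_size must be >= 0 and match_prob in [0, 1)")
    # 1 - (1 - p)^N, computed with log1p/expm1 so a tiny p does not lose precision.
    return -math.expm1(db_size * math.log1p(-match_prob))


def expected_innocent_hits(db_size: int, match_prob: float) -> float:
    """The simpler N * p figure, a close approximation while N * p is small."""
    return db_size * match_prob


if __name__ == "__main__":
    p = 1 / 1_100_000   # random match probability quoted in the thread
    n = 338_000         # hypothetical database size (assumption)
    print("N * p:            ", expected_innocent_hits(n, p))
    print("1 - (1 - p)^N:    ", prob_at_least_one_innocent_hit(n, p))
```

With these inputs both figures come out in the neighborhood of 1 in 3 to 1 in 4. Either way, the number describes what a search of innocent people would be expected to produce, before anyone has looked at a particular hit.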

  122. I suspect (but don’t know!) that the LAT — though it should be above this — is not being as patient as it should be, because it knows you’re an ever-persistent gadfly who sort of hates it, and stirs up people to hate it more. And I think you’re assuming this is yet more evidence of a supposed agenda on the LAT’s part, rather than an honest misunderstanding.

    Actually, I strongly suspect (and have all along) that the former scenario is what we have.

    Not Rhetorical, how’s about *you* give *me* the benefit of the doubt? Rather than assuming that I am using this as an example of agenda reporting, give me the benefit of the doubt and assume that I am merely frustrated that I am making an accurate point and getting brushed off.

    Patterico (cb443b)

  123. I’ll add this: there’s another far more subtle statistical issue that I initially put front and center in my first e-mail: the idea that larger databases increase the reliability of results, rather than decrease them.

    The response to that complaint ends up being that there is a lot of debate on the issue among knowledgeable people, and those who take the LAT side of the debate (which the article portrayed as the only valid view, although it is not) recognize a paradox but essentially brush it off as unimportant in the usual case involving smaller databases.

    But that was the first complaint I raised. It’s a bigger-picture issue, and it’s the one that more truly interests the statistics experts I contacted.

    It has taken a lot of discussion to isolate the three errors I ultimately concluded were beyond debate, that I have discussed in more recent posts. The response to those three errors has been unsatisfactory in that it doesn’t appear to grasp the issues, mis-cited commenters of mine as supporting the paper’s position, and used a little tu quoque (because my language had not always been 100% precise in my posts and comments, the response was essentially that sure, there might be a little ambiguity here and there, but ambiguity is not error, and we all do it).

    My point with this post was to show that there is no ambiguity in how you interpret this language.

    And we haven’t even begun to discuss the 1 in 1.1 million prosecutor’s fallacy error, which is a separate and clear error.

    Patterico (cb443b)

  124. Again, I do realize that this doesn’t undercut your basic premise: that you’re right. But I do think it seriously damages this particular thread as support of your point.

    Not Rhetorical, I think you and I are making the same points:

    1) The article was inaccurate. It should be corrected and a simple verb tense correction would yield accuracy.

    2) Making that correction would still mislead a lot of readers. Not as many, but a lot.

    My first response is that #2 is not a reason to refuse to address #1. Inaccurate is inaccurate.

    Fixing #2 is a relatively simple matter, as I have stated: include the assumption that the database includes only innocent people, and talk about the odds of nevertheless getting a hit.

    Patterico (cb443b)

  125. You are right.

    For what it is worth, I tried to come up with an alternate explanation and couldn’t think of one that made any sense.

    Stu707 (adbb5a)

  126. And, Not Rhetorical, there is a critical problem that arises when you leave out the assumption that the database is innocent. If you say that, as the database increases, there is an ever larger chance that any hit you get will be a hit to an innocent person, the logical result is absurd. By that logic, if you use a database of all DNA in the world, the reliability of any result is infinitely small. In reality, the reliability of a single hit from a completely comprehensive database would be (absent human error) infinitely great.

    This was the original point I made to Mr. Felch, and he acknowledged the paradox but essentially brushed it off as an anomaly that has no relevance to ordinary cases.

    But if you’re careful about how you express the conclusions — and you include my critical assumption that the database is composed only of innocent people — then the paradox resolves itself.

    For example:

    You have a comprehensive worldwide database and your random match probability is 1 in 6 billion. You get one hit. Using the LAT terminology, you now have a 6B/6B chance (100% certainty) that the search had hit — or even would hit! — an innocent person.

    But that’s wrong. You obviously have, absent human error, certainty that the search would/had hit a GUILTY person.

    Bottom line: It’s critical to explain that the adjustment recommended by the committees describes the chances of a hit to an innocent person, in a database of innocent unrelated people. And it gets more critical as the database grows.

    But yeah, that takes a lot of words to explain. So for now, I’m harping on the simple accuracy point, because I can express that easily. Change “had” to “would.”

    Of course, that still leaves another prosecutor’s fallacy in the article uncorrected.

    Patterico (cb443b)
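The worldwide-database example in the comment above can be checked numerically. The sketch below is one way to model it, under assumptions that are not in the thread: every person in the population is equally likely, a priori, to be the source; the source always matches; everyone else matches independently with the random match probability; and the search returns exactly one hit. Under those assumptions the chance that the single hit is the source works out to 1 / (1 + (population - db_size) * p). The function names are hypothetical.

```python
def naive_innocent_hit_figure(db_size: int, match_prob: float) -> float:
    """The database-size-times-match-probability figure, capped at 1."""
    return min(1.0, db_size * match_prob)


def posterior_single_hit_is_source(population: int, db_size: int, match_prob: float) -> float:
    """P(the one hit is the true source), assuming a uniform prior over the population."""
    if not 0 <= db_size <= population:
        raise ValueError("db_size must be between 0 and population")
    if not 0.0 <= match_prob <= 1.0:
        raise ValueError("match_prob must be between 0 and 1")
    # Only people OUTSIDE the database can be an unseen true source while the
    # observed hit is a coincidence, so only they weaken the evidence.
    people_outside = population - db_size
    return 1.0 / (1.0 + people_outside * match_prob)


if __name__ == "__main__":
    population = 6_000_000_000
    p = 1 / 6_000_000_000
    for db_size in (1_000_000, 3_000_000_000, 6_000_000_000):
        print(
            db_size,
            naive_innocent_hit_figure(db_size, p),
            posterior_single_hit_is_source(population, db_size, p),
        )
```

In this model the first figure climbs toward 1 as the database grows, while the second also climbs toward 1. The two numbers answer different questions, so reading the first as "the chance this hit is innocent" produces the absurd result described above.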

  127. You’re wrong.

    Not about the LA Times. About life. The stuff that matters.

    Because all I learned in this thread – since I already knew the Times employed nothing but hacks – is that you are a bit of a narcissistic, pedantic, obsessive-compulsive weenie.

    Did you really need to fill up a 125 comments with the blitheringly obvious?

    You’re right. You win. All your commenters are stupid (if they don’t agree with you). We get it.

    Don’t you have anything better to do?

    P (df860b)

  128. The verb tense problem also magnifies as the database size increases.

    If you say that, as the database increases, there is an ever larger chance that any hit you get will be a hit to an innocent person, you sound sillier and sillier as the database size approaches the size of the world population.

    But if you say that, as the database increases, there is an ever larger chance that you will get a hit to an innocent person, that is correct.

    Patterico (cb443b)

  129. Professor Blather (who is hiding behind the pseudonym “P”):

    This discussion goes straight to the accuracy of the central statistic on the front page of the L.A. Times’s Sunday newspaper.

    Now, it may be that the paper, by being stubborn, hopes that I will either a) let the issue go, or b) make myself appear to be obsessive by refusing to let it go. That’s a good rhetorical strategy — *if* you don’t actually care about accuracy. I don’t know if that’s what they’re doing, but if it is, then it’s working on people like you.

    If you’re not interested in this, then feel free to toddle off. I don’t really care whether you read my site and I never have. You have a certain history — which is maybe why you’re not commenting under your usual pseudonym — and you can take that baggage elsewhere if we’re boring you.

    But if you think this is unimportant, you’re just wrong. Flatly wrong. If it were, the paper wouldn’t have given it the prominence it did.

    Patterico (cb443b)

  130. Patterico, is there any other possible answer than your chosen answer? If you already know the answer, what’s the use of the exercise? Since those who say you are wrong are wrong as far as you are concerned, then I guess the answer should be, Yes, you are right. See?

    love2008 (5189d2)

  131. You are right.

    Most discussions on this blog are well thought out. You always have the extremes.

    These comments were sought to get a poll of how the readers saw the issue. The right or wrong at first gives us a chance to count the views. Also it gives us a chance to focus on the wrong views and see if there may be something that we missed.

    My view on the LAT and math is that most of the writers use a calculator to figure out how much change to put in a vending machine.

    EDP (7223ad)

  132. You’re right!

    But you’re also gettin’ a little Don Quijote-ish if you think editors at the Times are going to be intellectually honest enough to acknowledge that — at best — their choice of words is horribly misleading. They’re much happier “correcting” trivial errors, so they can delude themselves (and others?) into thinking that the L.A. Times cares about a clear presentation of the facts.

    Sancho Panza (d0a38f)

  133. Patterico, is there any other possible answer than your chosen answer? If you already know the answer, what’s the use of the exercise? Since those who say you are wrong are wrong as far as you are concerned, then I guess the answer should be, Yes, you are right. See?

    If you have an answer that’s consistent with the language of the article, let’s hear it and we’ll discuss it. That would be a lot more helpful than some snide remark that doesn’t suggest a plausible alternate reading.

    Patterico (298d8a)

  134. You’re right.

    Bill M (6a185f)

  135. #132
    Okay Patty, for what it’s worth, it’s very obvious that you are right.
    My assertion: the paper is saying there was a 1 in 3 chance that Puckett was innocent.

    The L.A. Times says I’m wrong. They say there is another way to read that language — but they won’t tell me what it is. (Yes, I asked.)
    There is no other way to read it except the way it is. Unless, of course, someone wants to misinterpret it for some disingenuous reasons. You are absolutely correct! Didn’t intend my comment #129 to come across as supercilious. Far be it from me.

    love2008 (5189d2)

  136. You are correct, that is the only way that statement can be read.

    pet campbell (576609)

  137. You’re right, inasmuch as it can be interpreted that way. My only qualm would be that the author may’ve intended something different, but their intent’s lost in the ambiguity of the antecedent of “it.” I would argue, like you are, that “it” refers back to the final clause of the previous sentence:

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    What “was 1 in 3”? “The statistic that leading scientists consider the most significant” is not “1 in 3.” That would make no sense. So it’d have to be “the probability that the database search had hit upon an innocent person.” The only other alternative is “the probability,” but since it’s modified by everything following “that,” that’s a difference with no distinction.

    The only thing it can mean the way it’s written is:

    The probability that the database search had hit upon an innocent person […] was 1 in 3.

    Barring, that is, any unusual circumstances in Puckett’s case of the sort we discussed last night, which isn’t what you want to discuss here.

    And before I forget:

    We need to wait for SEK to visit to tell us how to correctly interpret the information. That’s my story and I’m sticking to it.

    daleyrocks, last night the universe delivered the most righteous of bitch-slaps square to your face, but that’s something I’ll take up later.

    SEK (bd295a)

  138. You’re right. And I really worry about those who either didn’t understand the question or think that you’re wrong.

    Those people shouldn’t be allowed to vote or have pets, let alone children.

    Jack (d9cbc5)

  139. You’re right. There is an imponderable chance that the search hit upon an innocent man because we have NO idea about the likelihood of a given sexual predator being in the database.

    Kevin Murphy (805c5b)

  140. You’re right.

    I only skimmed a couple of the other comments, and am probably echoing someone else, but, hey, you asked. 🙂

    The first sentence cites a statistic: “the probability that the database search had hit upon an innocent person.” This much is self-evident.

    The second sentence reads “In Puckett’s case, it was 1 in 3.” (my emphasis) The next question is, what does “it” refer to?

    “It” relates to the phrase “1 in 3.” The quantity “1 in 3” refers to a specific probability. The previous sentence mentioned a specific probability. Hence, “it” refers to the probability cited in the first sentence.

    That was the probability “that the database search had hit upon an innocent person.”

    QED

    Casey (9ee427)

  141. Or, alternatively, you’re right. There are no innocents in the database.

    Kevin Murphy (805c5b)

  142. You’re right, of course, and I can’t add anything relevant not already mentioned in the 140 or so comments above. What I’d like to know is, who at the Times declined to “clarify” the intended meaning of the statement? I thought transparency by newspapers was the latest objective they’re touting these days. Was the uncooperative culprit the readers’ representative, an assistant editor, the reporter? Inquiring minds need to know.

    james fulton (86d115)

  143. No wonder the people at the LA Times are losing subscribers. For whatever reason, what they print in the paper isn’t what they mean.

    You’d think that printing or saying what you mean would be a first principle in that business.

    Jack (d9cbc5)

  144. You’re right.

    brobin (c07c20)

  145. j curtis, you just said the same thing Patterico does about that sentence, which means you agree with how he reads it.

    No. I didn’t specify Puckett, I specified “the return”.

    The sophistry is in the words “leading scientists”. There would be qualifiers for that number. Even the factor of determining that the return is still living and not deceased, that number would change. It changes big time if you learn that the return lives within a certain miles radius. That 1 of 3 they talk about could mean feeding the dna into a national database and getting one return without any consideration of locale or any other factor. If it was just a state database and not national, that narrows it down that much more. That ratio changes in a hurry as soon as you learn anything about the return.

    So, you “could” read that quote another way, although that other way would be very twisted with sophistry.

    j curtis (c84b9e)

  146. No. I didn’t specify Puckett, I specified “the return”.

    But the return was to Puckett. That’s the point. Given the facts of the case, there’s no other way to read the quote.

    So, you “could” read that quote another way, although that other way would be very twisted with sophistry.

    And it would ignore the plain English meaning of the phrase.

    Patterico (cb443b)

  147. You’re right. There’s no ambiguity – and the sophistry arguments don’t hold. Doesn’t matter what or who the “leading scientists” are, the meaning of what the LAT wrote is that Puckett is 33% “innocent”, and the implication is “not guilty” under the law.

    Bill (9179a5)

  148. You are right. It’s not a bad way to say the evidence shows there is a 1 in 3 chance that he is innocent.

    Steve Scheiwe (d671ab)

  149. “in Puckett’s case”…

    “Puckett’s case” could refer to “the case”. At some point in the case, you have the “hit” but you haven’t read the name and you know nothing about the individual who was singled out by the computer.

    “in the case with this singular database hit, which was subsequently determined to be Mr. Puckett….”

    If it was written that way, would there be a problem? Again though, it would still be poorly expressed because the criteria that determine the 1 of 3 aren’t qualified within the article.

    You can see now why all these old dna cases are being overturned. It is easy to trick juries stacked with your typical citizens and it’s not apparent that the prosecutors have the ability to point these tricks out.

    j curtis (c84b9e)

  150. Not understanding you, j curtis. Sorry.

    Patterico (cb443b)

  151. You are right.

    FrancisT (aeeaf5)

  152. You are right.

    151 comments, many erudite and interesting, but none with a trace of a plausible suggestion that you are other-than-right on this issue.

    By the way, Comment #126 discusses the odds of a random hit in the case of the complete database of all 6 billion innocent people, positing that this procedure has a 1-in-6-billion chance of finding a mistaken match. The number of hits would be governed by the Poisson distribution. For a set of runs of this database against a series of suspect DNAs, there would be on average one hit per run. About equally often there would be no hit or exactly one hit, occasionally two hits, rarely three hits, very rarely four hits.

    Of course, if the database were truly complete and thus contained the guilty party’s DNA, these hit-per-run numbers would all be incremented up by one (as the match with the guilty DNA would count for one hit in each run).

    Of course, too, this is way too subtle for a discussion that must center on the proper use of double-secret antecedents.

    AMac (aa8819)
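
    The Poisson figures in comment 152 can be checked with a short calculation. This is only an illustrative sketch: it assumes, as that comment does, an average of one coincidental hit per run, and the function name `poisson_pmf` is made up for this example.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Assumption taken from comment 152: one coincidental hit expected per run.
expected_hits = 1.0

for k in range(5):
    print(f"{k} coincidental hits: {poisson_pmf(k, expected_hits):.3f}")
```

    Under that assumption, zero hits and one hit each come up about 37% of the time, two hits about 18%, three about 6%, and four about 1.5%.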

  153. AMac, keep in mind that in comment 126 I was merely discussing the statistical measures advocated by the LAT, and not the measures I believe are appropriate.

    Patterico (cb443b)

  154. Understood, Patterico @ 153.

    AMac (1bee00)

  155. You are right.
    The statistic is…the probability that the database search had hit upon an innocent person.

    In Puckett’s case, 1 in 3. [that the database search had hit upon an innocent person.]

    MikeNil (4b0c7b)

  156. Oh God, more demonstration that no one, particularly the media, understands Bayesian statistics.

    Your assertion seems correct; but there is more to the issue than that. (Isn’t there always…)

    Needed:
    1. Specificity of DNA testing – a criterion such as the number of alleles tested, etc. – i.e., what does (+) mean for this test – the number of allele matches required selects for the specificity of the test – the “true negatives”.
    2. Sensitivity of DNA testing – similarly, the extent of matching required selects for the sensitivity of the match – the “true positives”.
    Note: the greater selectivity you place on one part, the more you may harm the other – If, for instance, one selects to NEVER miss a positive test or match, the greater the risk of false positive tests. And thus, less overall accuracy.
    In analysis of diagnostic tests in medicine, the selection of just what (+) means AND the actual prevalence of the disease state (just how much is floating around) is critical to the analysis.

    In DNA analysis for forensics, the key would be the breadth of the denominator – database size and source, and the definition of “positive” as it affects same. Hence, the accuracy of the test is a function of decisions made.

    From the statement, it would seem that they are describing the specificity of the test, which would indicate that they calculate that 1 of 3 tests noted as positive would in fact be a false positive finding – not even very good if you are looking to eliminate a chunk of the pool and then proceed to further screen – but absurdly high to even consider charges let alone conviction without a lot of evidence. Numbers like that barely allow the tests to be good exclusion tests, let alone inclusive – “guilty” tests.
    Missing from the statements is any data as to what percentage of “negative matches” are in fact false negative. You need both sides of the analysis to really KNOW what is going on.

    John Gregg, MD (71b4bf)
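
    The dependence on prevalence that comment 156 describes for diagnostic tests can be illustrated with the standard positive-predictive-value formula from Bayes’ rule. This is a rough sketch only: the sensitivity, specificity, and prevalence values are invented for illustration and are not figures from the Puckett case, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Share of positive results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical test: 99% sensitive and 99% specific (illustrative numbers only).
for prevalence in (0.5, 0.01, 0.001):
    ppv = positive_predictive_value(0.99, 0.99, prevalence)
    print(f"prevalence {prevalence}: a positive is correct with probability {ppv:.3f}")
```

    With these made-up numbers, the same test is right about 99% of the time when half the population has the condition, but only about half the time when 1 in 100 does, which is why the prevalence matters as much as the test itself.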

  157. Dr. Gregg, unfortunately the post is not about statistics or DNA. It’s about English and sentence structure, so further analysis is unnecessary. Does the sentence, correctly or incorrectly, make the claim that there is a 1 in 3 chance that Puckett was innocent? To most of us, it clearly does. If you don’t think so, how else can that be read? Give a plausible alternative in plain English.

    Stashiu3 (460dc1)

  158. They had no choice as to specificity. The sample was degraded. They would have wanted thirteen indicators (alleles?)* but they could only get five and a half.

    *Is there such a thing as half an allele?

    nk (6c75e0)

  159. You are right, but why do so many of your correspondents have trouble commenting on the question you pose and veer off onto other issues? Still, I think a blogger can often be judged by the quality of the comments on the blog, and you and yours are among the best.
    JO

    J Ozols (e470bc)

  160. To add to my “You are right” above.

    LA times sentences

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    This can be rewritten as follows by replacing the pronoun it with what it refers to.

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, the probability that the database search had hit upon an innocent person was 1 in 3.

    and then we can join the two sentences by expanding the replacement with the adjectival which clause

    In Puckett’s case, the probability that the database search had hit upon an innocent person, which is the statistic that leading scientists consider the most significant, was 1 in 3.

    And then we can move the “In Puckett’s case” to the end of the sentence.

    The probability that the database search had hit upon an innocent person, which is the statistic that leading scientists consider the most significant, was 1 in 3 in Puckett’s case.

    If one argued that the original statement did not mean that Puckett had a 1 in 3 chance of being innocent then that same argument should apply to my 3rd rewrite above since all I have done in the rewriting is move dependent clauses around. That 3rd statement has an adjectival clause (which is the statistic that leading scientists consider the most significant) that can be removed without changing the meaning of the rest of the sentence. Hence the sentence simplifies to

    The probability that the database search had hit upon an innocent person was 1 in 3 in Puckett’s case.

    The difference between that statement and Patterico’s claim

    there was a 1 in 3 chance that Puckett was innocent

    lies purely in the question of whether it is reasonable to assume there is a 1:1 correspondence between actual innocence and ‘the database search hitting upon an innocent’. Given what I recall of the surrounding text that seems like a reasonable assumption in the original context.

    As I see it the only way the LAT can weasel their way out of this is a Clintonian quibble on the meaning of the word ‘it’ in the original text. I would love to see how they redefine ‘it’ and whether they can find any competent grammarian to agree with their redefinition.

    FrancisT (aeeaf5)

  161. You’re wrong.

    Actually, I don’t care whether you are right or wrong, on this, but since I had to pick one, I thought I’d be contrary and at least get people to look at my post.

    [Well, then, please don’t say I’m wrong if you really don’t believe that to be true, just because you’re generally hostile to my positions. — P]

    The only way I can come up with to give the L.A. Times the benefit of the doubt is as follows:

    Given the same set of parameters as the Puckett case, the chance of there being a false positive is 1 in X.

    [That presumes the article had used the conditional “would” instead of the pluperfect “had.” Look at the article’s actual language. — P]

    In other words, irrespective of Puckett, his actual guilt or innocence, if we took those same numbers we’d get a “hit” with some probability, Y. From that we can then go on to calculate odds, and so forth. The article is sloppily written, IMO, but not surprising when a non-expert is writing about statistics/probability theory. As the Monty Hall problem shows, it is a subject area that often yields non-intuitive answers.

    [When you say “we’d get a hit” your contraction is short for we WOULD get a hit. But the article said “had” and not “would.” — P]

    Steve Verdon (94c667)

  162. Steve Verdon, I have responded to you within your comment, but the bottom line is, your response ignores the actual language of the article, which discusses the chances that the search “had hit” on an innocent person.

    Patterico (cb443b)

  163. Maybe you should post a new entry asking the Steve Verdons of the world if there is a difference between the past and future tenses.

    Xrlq (b71926)

  164. Pat, I think I have found the “weasel out” interpretation they may be alluding to, but I will have to test it here.

    The statistic that leading scientists consider the most significant is the probability that the database search had hit upon an innocent person. In Puckett’s case, it was 1 in 3.

    Pat’s assertion: the paper is saying there was a 1 in 3 chance that Puckett was innocent.

    I outlined what I believe is the only reasonable reading of this in #68 and I wholly agree with Pat that his interpretation is what is intended.

    Here is, however, what I believe is the weasel interpretation that allows the LATimes the very slight wiggle room to suggest that they “unintentionally” wrote it in a way that could be interpreted in the way virtually everyone here took it.

    The statistic that leading scientists consider the most significant is the PROBABILITY that the database had hit upon an innocent person.

    That PROBABILITY in [this set of facts] was 1 in 3.

    I’ll say it slightly differently, just to highlight the supposed “intent” in the weasel words.

    Blah, blah, blah…leading scientists…blah, blah, blah…utilizing 5.5 instead of 13…. leads to a PROBABILITY of a false positive in a sample size like this GIVES US a MATHEMATICAL ratio of 33%.

    The “in Puckett’s case” language…is a shortcut for (utilizing 5.5 markers/given this sample size.)

    The “had hit upon an innocent person”…is a shortcut for “had returned a false positive”. (the false positive meaning…someone who had 5.5 but who would NOT have all 13 markers in common)

    So, if they were intending to be accurate and intending to be understood properly, they would have written the sentence as follows:

    The PROBABILITY statistic that leading scientists might find troublesome is that utilizing a random database of this size and only having 5.5 markers would produce a false positive 33% of the time. That is, in approximately one third of random databases of this size, you could expect someone to match 5.5 markers but not all 13.

    Instead, they INTENTIONALLY said, (paraphrasing) THE MOST SIGNIFICANT PROBABILITY IS THAT THE DATABASE HAD HIT UPON AN INNOCENT PERSON….IN PUCKETT’S CASE, IT WAS 1 IN 3.

    There is no need to define the meaning of “it”. It refers to the inevitable outcome of hitting on AN INNOCENT PERSON.

    The “in Puckett’s case” simply personalizes the “innocence” of Puckett himself. The reasonable interpretation is the only one expected and I don’t believe the weasel version is reasonable.

    cfbleachers (4040c7)

  165. Cfbleachers, you are missing the point of this exercise. It’s not about equating genuine DNA matches with guilt or apparent ones with innocence. It’s about confusing the original odds that a given event would occur with the after-the-fact odds that it did occur in any particular case.

    Your non-point about the “leftist media [having] this game down pat” is not taken. The L.A. Times is as knee-jerk liberal as the day is long, but they have nothing to gain ideologically by making the case against Puckett look more solid than it really is.

    Xrlq (b71926)

  166. XRLQ

    I thought the exercise was to determine what the reasonable interpretation of the sentence structure was…and, simultaneously, whether that intended interpretation was intentional.

    I conclude that Pat’s interpretation is the only reasonable one and that it WAS intentional.

    The sentence structure utilized INTENDS to convey the notion that Puckett was innocent AND that such conviction of an INNOCENT PERSON…was not only highly probable…but inevitable.

    This is completely consistent with the continuing narrative by the leftist media, including the LATimes (perhaps especially the LATimes)…that the police and prosecutors are “out to get” a certain class of people and wrongfully arrest and prosecute them as criminals.

    i.e., “As in Puckett’s case…1 in 3 is INNOCENT”.

    The fact remains, we don’t know that Puckett is innocent, nor do we know that the DNA testing returned a false positive as it relates to him or “his case”.

    The use of the term “innocent” when combined with the phrase “as in Puckett’s case”…is an intended “connective phrasing”.

    That doesn’t make the prosecution’s case look stronger, it makes it look oppressive. Which is and was the intention all along.

    cfbleachers (4040c7)

  167. Stating that there is a 1 in 3 chance of Puckett’s innocence makes it sound as though he’s twice as likely to be guilty as he is to be innocent. That makes the prosecution’s case sound stronger than it is. In reality, we don’t know that Puckett is any more likely to be guilty than innocent. For all you or I know, the odds could be 2:1 the other way, or worse.

    Xrlq (b71926)

  168. Stating that there is a 1 in 3 chance of Puckett’s innocence makes it sound as though he’s twice as likely to be guilty as he is to be innocent. That makes the prosecution’s case sound stronger than it is. In reality, we don’t know that Puckett is any more likely to be guilty than innocent. For all you or I know, the odds could be 2:1 the other way, or worse.

    Stating that prosecutors were convicting suspects who statistically had a one in three chance of being a wrongfully convicted INNOCENT party…is not suggesting that the other two options are guilty, XRLQ. They ALL could be “innocent”…but the one in three would be convicted, wrongfully.

    To insert INNOCENCE into the statistical probabilities, is to suggest that one in three INNOCENT PERSONS would get “fingered” wrongfully for prosecution. It does NOT suggest that the other two…are innocent.

    Three people are put into a lineup. One person has five and a half similar attributes to the physical description of the perp. There are 13 total physical descriptions. The one that matches 5.5 of them is wrongfully prosecuted, he is “innocent”. The other two are let go. They are also treated as “innocent”.

    The 1 in 3 chance that Puckett is “innocent” doesn’t mean or even sound like he is twice as likely to be guilty. They are saying you can’t PROVE GUILT with 5.5 markers…only probability of innocence….WHICH IS TOO HIGH TO BRING THE CASE AND ON ITS FACE CONSTITUTES REASONABLE DOUBT.

    cfbleachers (4040c7)

  169. Simple logic dictates that a 1 in 3 chance of any given proposition P implies a 2 in 3 chance of ~P. Therefore, a 1 in 3 probability that a database search will return a random match to someone unrelated to the true donor entails a 2 in 3 chance that it will not. None of that translates into a 1 in 3 or 2 in 3 chance that any particular individual is or isn’t the true donor. That was the fundamental mistake made by both the prosecution and the defense in Puckett’s case. Whether it’s the garbage figure the prosecution requested (1 in 1.1 million) or the accurate figure the defense requested (1 in 3), neither tells us anything about the likelihood of a true match occurring.

    The 1 in 3 chance that Puckett is “innocent” doesn’t mean or even sound like he is twice as likely to be guilty.

    I’ll grant you that to the logic-challenged it may not sound like that, but it most certainly does mean it. And unless you can show that the killer’s odds of being in the database were better than 50-50, the odds that Puckett wrongly matched may well be 1 in 3, or worse.

    They are saying you can’t PROVE GUILT with 5.5 markers…only probability of innocence…WHICH IS TOO HIGH TO BRING THE CASE AND ON ITS FACE CONSTITUTES REASONABLE DOUBT.

    Which is true, you can’t, unless you have a dishonest prosecutor to tell the jury that those 1 in 3 odds are really 1 in 1.1 million, and an innumerate judge who lets him get away with it. Oh, wait…

    More importantly, once you get past the prosecutor’s fallacy, if you ever make it past there, you’ll see that there is no reason to think that the real odds of Puckett’s innocence, based on DNA alone, are any more favorable than the 1 in 3 figure erroneously cited by the Times. All there is, is a bigger unknown, which makes the prosecution’s case look worse still. Which would you rather see, juries who fudge reasonable doubt and convict people who are 2:1 more likely to be guilty rather than innocent, or juries who ignore probabilities entirely and convict people for whom they have no idea how likely they are to be guilty?

    Xrlq (b71926)
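
    The distinction drawn in comment 169, between the chance that a search produces a coincidental match and the chance that a particular hit is coincidental, can be made concrete with a small Bayesian sketch. Everything here is an assumption made for illustration: the true donor always matches if present in the database, coincidental matches are independent, the “1 in 3” figure is treated as the expected number of coincidental matches per search, and the prior that the true donor is in the database (which nobody in the thread claims to know) is simply varied. The function name is hypothetical.

```python
def prob_hit_is_coincidental(prior_donor_in_db: float,
                             expected_random_matches: float) -> float:
    """P(the single hit is a coincidental match | the search returned exactly one hit).

    With independent coincidental matches, the "nobody else matched" factor is
    common to both hypotheses and cancels, leaving the two weights below.
    """
    coincidental_weight = (1.0 - prior_donor_in_db) * expected_random_matches
    true_donor_weight = prior_donor_in_db
    return coincidental_weight / (true_donor_weight + coincidental_weight)

# Assumption: the "1 in 3" figure stands for the expected coincidental matches.
expected_random_matches = 1.0 / 3.0

for prior in (0.1, 0.4, 0.5, 0.9):
    posterior = prob_hit_is_coincidental(prior, expected_random_matches)
    print(f"prior {prior:.1f} that the donor is in the database -> "
          f"{posterior:.3f} chance the hit is coincidental")
```

    Under these assumptions the answer swings from about 0.75 (prior 0.1) to about 0.04 (prior 0.9), and a prior of 0.4 happens to give exactly 1 in 3. The search statistic alone does not fix the probability of innocence; the unknown prior does most of the work.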

