Patterico's Pontifications

7/21/2008

The Power of the Jump™: Shockingly Unexpected DNA Results Are “Indeed” Expected!

Filed under: Crime,Dog Trainer,General — Patterico @ 12:24 am

(Note: “The Power of the Jump”™ is a semi-regular feature of this site, documenting examples of the Los Angeles Times’s use of its back pages to hide information that its editors don’t want you to see.)

Above the fold in their Sunday edition, the L.A. Times asks in its lead story: “How reliable is DNA in identifying suspects?” The deck headline: “A discovery leads to questions about whether the odds of people sharing genetic profiles are sometimes higher than portrayed. Calling the finding meaningless, the FBI has sought to block such inquiry.”

The story is by our old friends Jason Felch and Maura Dolan, who so badly botched the statistics in this area before.

Here’s their lede:

State crime lab analyst Kathryn Troyer was running tests on Arizona’s DNA database when she stumbled across two felons with remarkably similar genetic profiles.

The men matched at nine of the 13 locations on chromosomes, or loci, commonly used to distinguish people.

The FBI estimated the odds of unrelated people sharing those genetic markers to be as remote as 1 in 113 billion. But the mug shots of the two felons suggested that they were not related: One was black, the other white.

In the years after her 2001 discovery, Troyer found dozens of similar matches — each seeming to defy impossible odds.

Wow! So the matches Troyer found were statistically unexpected, right?

Mmmm . . . not so much. Here’s the crucial passage, which was buried at page A20:

Indeed, experts generally agree that most — but not all — of the Arizona matches were to be expected statistically because of the unusual way Troyer searched for them.

Indeed!

In a typical criminal case, investigators look for matches to a specific profile. But the Arizona search looked for any matches among all the thousands of profiles in the database, greatly increasing the odds of finding them.

Well, there you have it! And you have only to turn to Page A20 to learn this! And, of course, every loyal L.A. Times reader does exactly that!

Just ask around! You don’t know anyone who just scans the headlines and the front page, do you?

What’s that? That describes most people you know?

Well . . . . that’s OK, I guess. I’m sure none of those people will end up as jurors on our DNA cases . . .

UPDATE: David Kaye has an excellent post on this that helps put it all in context.

46 Responses to “The Power of the Jump™: Shockingly Unexpected DNA Results Are “Indeed” Expected!”

  1. This isn’t really worth inclusion in the main post. But still:

    Encouraged, Barlow subpoenaed a new search of the Arizona database. Among about 65,000 felons, there were 122 pairs that matched at nine of 13 loci. Twenty pairs matched at 10 loci. One matched at 11 and one at 12, though both later proved to belong to relatives.

    So. Matches at 11 and 12 loci proved to belong to relatives. And the statistics given ALWAYS relate to an assumption of a pool of UNRELATED people.

    So: what about the twenty pairs matching at 10 loci? We’re not told if there are any related people there. Same for the even larger number of matches at 9 loci.

    But I’m sure that’s all irrelevant!

    Patterico (cb443b)

  2. More:

    We’re told in the article that criminologists “scrambled to explain” the results. Look at the results of all that DESPERATE SCRAMBLING!!!

    Law enforcement experts scrambled to explain.

    Three months after the court-ordered search in Arizona, Steven Myers, a senior DNA analyst at the California Department of Justice, gave a presentation to the Assn. of California Crime Lab Analysts. It was titled “Don’t Panic” — a hint at the alarm Troyer’s discovery had set off.

    Many of the Arizona matches were predictable, Myers said, given the type of search Troyer had conducted.

    In a database search for a criminal case, a crime scene sample would have been compared to every profile in the database — about 65,000 comparisons. But Troyer compared all 65,000 profiles in Arizona’s database to each other, resulting in about 2 billion comparisons. Each comparison made it more likely she would find a match.

    When this “database effect” was considered, about 100 of the 144 matches Troyer had found were to be expected statistically, Myers found.

    Troyer’s search also looked for matches at any of 13 genetic locations, while in a real criminal case the analyst would look for a particular profile — making a match far less likely.

    Further, any nonmatching markers would immediately rule out a suspect. In the case of the black and white men who matched at nine loci, the four loci that differed — if available from crime scene evidence — would have ensured that the wrong man was not implicated.

    The presence of relatives in the database could also account for some of Troyer’s findings, the FBI and other experts say. Whether that’s the case would require cumbersome research because the databases don’t contain identifying information, they say.

    (Emphasis in original.)

    Meh. A buncha wonkery that undercuts the main point: THEM PROSECUTORS AND THEIR SCIENCE CAN’T BE TRUSTED!!!!!1!!!eleven!!!!!one!!!

    Patterico (cb443b)
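
The gap between the two kinds of search quoted above is easy to check by counting comparisons. What follows is only an illustrative sketch: the helper name `pair_count` is made up for this example, and 65,000 is the article's rounded figure for the size of the Arizona database.

```python
import math


def pair_count(n: int) -> int:
    """Number of distinct unordered pairs among n profiles: n * (n - 1) / 2."""
    return math.comb(n, 2)


# A routine search: one crime-scene sample checked against each profile once.
single_search = 65_000

# An all-against-all search: every profile checked against every other profile.
all_pairs = pair_count(65_000)

print(single_search)  # 65000
print(all_pairs)      # 2112467500
```

The all-against-all search makes roughly 32,500 times as many comparisons as a single-sample search, which is why a handful of coincidental partial matches is far less surprising than the per-comparison odds suggest.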

  3. No one knows precisely how rare DNA profiles are. The odds presented in court are the FBI’s best estimates.

    Actually they are like fingerprints, unique (when they take other ‘loci’ into consideration if more than one match is found)

    In one case, Callaghan advised state officials to raise the risk of expulsion with a judge, then told the officials that expulsion was unlikely to happen, according to a record of the conversation filed in court.

    In an interview with The Times, Callaghan denied any effort to mislead the court.

    Notice how the unsubstantiated claim is given and then the ‘interview’ given smaller weight.

    Troyer’s search also looked for matches at any of 13 genetic locations, while in a real criminal case the analyst would look for a particular profile — making a match far less likely.

    Further, any nonmatching markers would immediately rule out a suspect.

    And this quote (from your comment) is the winner of the piece. When you find ANY match, you use the full profile to (attempt to) rule out the innocent

    Lord Nazh (899dce)

  4. So, basically, they’re saying Barlow’s search was the equivalent of “find everyone with blonde hair and blue eyes”, while the standard used for evidence is more along the lines of “everyone with blonde hair, blue eyes, 6’2″, 220-240 lbs, and matching this photo”?

    Rob Crawford (6c262f)

  5. You seem to have quite the blind spot when it comes to DNA. Contrary to your revisionism, Felch and Dolan did not “badly botch” the DNA issue last time around. They committed one significant error in that article, the prosecutor/selection fallacy, and that error served only to water down their story somewhat, making the junk science used to convict Puckett look slightly more reliable and slightly less junky than it actually was. Then, as now, you ignored the real story, focusing instead on that error and a host of other more minor ones, real and imagined. In so doing, you committed a couple of real whoppers yourself, namely (1) conflating the 1 in 1.1 million odds of a random match between records from unrelated individuals with the much higher odds of a not-so-random match between two records from the same individual, which RMP does not contemplate; and (2) making the aggregate 1 in 3 odds of an innocent match contingent on the irrelevant question of whether the killer’s own DNA was also lurking somewhere in the database. The Times never owned up to their more serious error, but then again, neither did you. [Fortunately or unfortunately, that liberal troll you recently banned for promising and failing to identify uncorrected errors of yours was too dumb to notice this example and I, wanting him gone anyway, wasn't too keen on helping him find it.]

    This time, you once again ignored the real story (which is that the FBI asks us all to trust its estimates of the likelihood of a match, while fighting tooth and nail against any effort to test these estimates empirically) and focused instead on an irrelevancy. Only in this case, the irrelevancy wasn’t an error at all, nor even a credible example of burying the lead/lede. It’s not as though your friends Felch and Dolan deliberately buried the paragraph in question deep in the article where they hoped no one would find it. Quite the contrary, it appears relatively early in the article, enough so to appear on page 1 among 5 online. Your only issue here seems to be that newspapers generally have a practice of squeezing in one or two paragraphs from many articles onto the front page, as opposed to running one or two in their entirety, and relegating everything else to the back.

    Care to comment on the substance of the article? There are some important issues in there, which are worthy of a good discussion. Hair-trigger charges of jump-powering are not it.

    Xrlq (b71926)

  6. The article’s message should have been not to do DNA searches the way Troyer did.

    Bradley J Fikes (0ea407)

  7. Isn’t what Troyer did a variation on the birthday problem? The likelihood of any randomly selected person having the same birthday as me is 1/365 (leaving leap years aside). The likelihood of someone else in a group of 23 having the same birthday as me is roughly 22/365, or about 6%. However, the likelihood that in a group of 23, two people will share the same birthday is slightly over 50%.

    Analogously, the likelihood that a given DNA sample will match another may be very small, even though the likelihood that there will be two matches in a very large group may be comparatively high.

    KRB (b93484)
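
The two birthday figures in the comment above can be reproduced with a short calculation. This is a minimal sketch, assuming 365 equally likely birthdays and no leap years; the function names are invented for illustration.

```python
def match_my_birthday(group_size: int, days: int = 365) -> float:
    """Chance that at least one of the OTHER people in the group
    shares one specific person's birthday."""
    others = group_size - 1
    return 1.0 - ((days - 1) / days) ** others


def any_shared_birthday(group_size: int, days: int = 365) -> float:
    """Chance that at least two people in the group share a birthday,
    with the shared date left unspecified."""
    p_all_distinct = 1.0
    for k in range(group_size):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


specific = match_my_birthday(23)    # about 0.06
any_pair = any_shared_birthday(23)  # just over 0.5

print(f"someone shares MY birthday: {specific:.3f}")
print(f"any two share a birthday:   {any_pair:.3f}")
```

The first number answers "does anyone match this particular profile?" and the second answers "does anyone match anyone?", which is the distinction between a routine database search and the Arizona search.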

  8. Certainly, the odds of anyone matching anyone else in the database are very high. The key question is whether they are roughly what the FBI predictions say they should be, or significantly higher or lower than that. That the FBI fights against having such studies done at all is, I submit, inexcusable. Do the studies, let the experts crunch the numbers, and if all the studies show is what the experts have been saying all along, great.

    Xrlq (b71926)

  9. The problem — and it’s real — is that the FBI is trying to hide the possible problem, rather than offering a proper, mathematically correct, explanation which would reveal that it is a non-problem. Perhaps it is a problem.

    htom (412a17)

  10. seconding xrlq #5 and htom #9. i’m not a geneticist or a statistician. i took away from the article the fbi’s fear that this knowledge would be disseminated and its scramble to cover up the science elucidated in the barlow search. it threatened states and their crime labs with expulsion from the program. after acknowledging that the arizona lab responded to a valid court order, it still maintained that release of the information was “unauthorized”. it prevailed upon the arizona people to get their own court order seeking to prevent barlow from sharing the results with her colleagues in the defense bar, but like a good lawyer anticipating negative developments, she had already done so. the supreme court in maryland got it right when it said that a defendant’s due process rights don’t hinge on an fbi employee’s determination that release of certain information is “unauthorized”.

    patterico, you can’t have it both ways. as an experienced trial prosecutor, there must have been many occasions when you reminded the jury in your summation of the defendant’s efforts to hide evidence/cover his tracks, and this is a persuasive line in this context. i followed the hans reiser murder trial from my seat in the internet jury box, and i voted guilty of murder, more than anything else because he removed and discarded the back seats of his wife’s car with no plausible reason (subsequently vindicated when he showed us the body in return for a murder 2 sentencing). the fbi is doing the exact same thing that hans reiser and the murderers you’ve prosecuted did. all available inferences will be drawn against the party who tries to hide something from me.

    assistant devil's advocate (a57c94)

  11. Maybe I misunderstand the Times’ article, but it appears to me that the problem is not so much how many points match, but the way statistics about the match are presented to juries regarding the likelihood of the match.

    After thinking about it, I am left wondering if the way DNA statistics are sometimes presented can be misleading. To explain, if the odds of a match–using a given number of points–with someone in the general population are–for example–ten million to one, then the odds of a match would go up anytime you restrict the database to anything smaller than the general population. Therefore, it seems to me that although accurate statistically, testifying to higher odds because of the database being smaller is misleading to the jury–almost to the point of dishonesty. As the old saying goes, there are lies, damn lies, and statistics. While I don’t think anyone started out to be deliberately misleading, I think there is a likelihood that people have fallen into the logical trap of trying to make odds of matches seem more incriminating than they are through the use of different sized databases when calculating those odds.

    In the end I doubt that the method of presenting odds has had much if any effect upon verdicts because I find it unlikely that a criminal case against someone has been presented using only DNA evidence as a means of proving guilt. While DNA evidence can be a powerful tool for prosecutors, there needs to be other evidence in order to connect the accused with the crime. After all, no competent prosecutor would present a case, even if the DNA matched, when it was provable beyond all doubt that the matched person was elsewhere at the time of the crime and therefore could not possibly be the guilty party, or that the person was unable to have committed the crime for one reason or another. Still, I do think we need to discuss the way in which odds of a match are presented and arrive at a means that is fair to both prosecution and defense and does not mislead the jury. As I said, restricting the database in order to raise the odds to astronomical levels appears misleading to me.

    Fritz (0c0f42)

  12. Here is a 14-page article that provides useful context to this issue–although most of the mathematics is too advanced for me to follow.

    Bruce S. Weir, “The Rarity of DNA Profiles” (link to pdf) The Annals of Applied Statistics 2: 358-370, 2007.

    Prof. Weir is in the Dept. of Biostatistics at the University of Washington. He’s a heavy hitter in the field; for a list of publications, enter “Weir BS” into the PubMed search engine.

    From the Introduction:

    There are forensic, statistical and genetic aspects to discussions of profile rarity. The key forensic issue centers on the comparison of two profiles, often one from a crime-scene sample and one from a suspect. The relevant calculations must recognize the existence of two profiles rather than focusing on only one of them. The statistical aspects are addressed initially by the “Birthday Problem.” The probability that a person chosen randomly has a particular birthday is 1/365, ignoring leap-year complications, but there is over 50% probability that two people in a group of 23 people share a birthday. This result recognizes that the number of pairs of people, 253, is much greater than the number of people, 23, and that the particular shared birthday is not specified. The finding of DNA profile matching in an Arizonan database of 65,000 profiles [Troyer, Gilroy and Koeneman (2001)] becomes less surprising when it is recognized that there are over two billion possible pairs of profiles in that database. The genetic aspects rest on the shared evolutionary history of humans. The very fact that the population is finite means that any two people have shared ancestors and the resulting dependencies increase the probability of profile matching.

    From the Discussion:

    Given the widespread belief that specific forensic profiles are rare, there has been some concern expressed at the finding of matching or nearly matching profiles in databases of less than 100,000. Such findings were predicted by Weir (2004), unaware that they had already been reported [Troyer, Gilroy and Koeneman (2001)] for the case of two profiles matching at nine of 13 loci. At the simplest level, the apparent discrepancy is merely an application of the birthday problem. If all DNA profiles have the same probability P, and if profiles are independent, then the probability of at least two instances of any profile in a set of n profiles is approximately 1−exp(−n^2*P/2). This probability can be large even for small P and it can be 50% when n is of the order of [1 divided by the square root of P].

    The citation to Kathryn Troyer’s presentation of the AZ database matches:

    Troyer, K., Gilroy, T. and Koeneman, B. (2001). A nine STR locus match between two apparent unrelated individuals using AmpFlSTR Profiler Plus and COfiler. Proceedings of the Promega 12th International Symposium on Human Identification.

    Google also came up with a good post by Charles Brenner dated January 8, 2007, entitled Arizona DNA database “matches”. He argues against errors made by NPR’s “Math Guy” Keith Devlin that might mirror one side in the present conflict.

    I conclude that:

    * The issues raised by these LAT articles have been widely discussed within the forensics community for some time.

    * In particular, Weir notes that the terms “Prosecutor’s Fallacy” and “Defense Attorney’s Fallacy” were coined in 1987 to describe the two respective misinterpretations of database odds that we have been debating.

    * People within law enforcement seem most concerned with defending against the Defense Attorney Fallacy, while the LAT and defense lawyers are most worried about the Prosecutor’s Fallacy.

    * This is a multifaceted issue that is ill-suited to a “here’s the simple story, white hats here and black hats over there” narrative.

    * Jurors in felony trials should know that “one in 5 quadrillion” odds don’t mean what they seem. Neither does “matches pop up shockingly often”–a fuller view of the probabilities (and of the impact of siblings and cousins in a database) means that Troyer’s surprising findings actually weren’t.

    AMac (c822c9)
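
The approximation quoted from Weir's Discussion, 1−exp(−n^2*P/2), can be tried out numerically. The sketch below makes the same simplifying assumptions as the quoted passage (every profile has the same probability P, and profiles are independent), so it describes exact duplicates under that idealized model, not the nine-of-13 partial matches in the Arizona data. The function names and the value of P are illustrative assumptions, not figures from the paper or the FBI.

```python
import math


def prob_any_duplicate(n: float, p: float) -> float:
    """Approximate chance that some profile occurs at least twice among
    n profiles, using the quoted formula 1 - exp(-n^2 * p / 2)."""
    return 1.0 - math.exp(-(n ** 2) * p / 2.0)


def n_for_even_odds(p: float) -> float:
    """Database size at which the approximation reaches 50%.
    Solving 1 - exp(-n^2 * p / 2) = 0.5 gives n = sqrt(2 * ln 2 / p)."""
    return math.sqrt(2.0 * math.log(2.0) / p)


P = 1e-9  # hypothetical per-profile probability, chosen only for illustration

n_half = n_for_even_odds(P)
print(round(n_half))                   # tens of thousands, not billions
print(prob_any_duplicate(n_half, P))   # approximately 0.5
print(n_half * math.sqrt(P))           # about 1.18, i.e. "of the order of" 1/sqrt(P)
```

Under these assumptions, a one-in-a-billion profile probability reaches even odds of a duplicate with a database in the tens of thousands, which is the sense in which the probability "can be large even for small P."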

  13. One more reason to do away with the death penalty on a federal level.

    Oiram (983921)

  14. This is the “Birthday Problem.” If you have 40 some people in a room, it is almost a sure bet that two of them share a birthday. The odds that any of them share YOUR birthday (or any other particular date) are much slimmer.

    In other words, this “research” means essentially nothing.

    Ron (986aaf)

  15. KRB — 7/21/2008 @ 6:55 am:

    Isn’t what Troyer did a variation on the birthday problem? …
    Analogously, the likelihood that a given DNA sample will match another may be very small, even though the likelihood that there will be two matches in a very large group may be comparatively high.

    assistant devil’s advocate — 7/21/2008 @ 8:15 am:

    seconding xrlq #5 and htom #9. i’m not a geneticist or a statistician. i took away from the article the fbi’s fear that this knowledge would be disseminated and its scramble to cover up the science elucidated in the barlow search.

    4X Bingo! KRB, xrlq, htom & ADA.

    The result is not unusual, or anomalous, or rare. The result is just another result of the well known birthday problem or birthday paradox. It’s not really a paradox, but some refer to it that way because it is counterintuitive to those with no understanding of statistics.

    That the FBI went nuts trying to cover up the completely uncontroversial empirical observation of a well known mathematical statistical result speaks volumes to its integrity. Is FBI leadership too stupid to understand a simple and well known statistical issue? Or are they just attempting to use institutional power to prevent ordinary citizens from noticing a fundamental mathematical truth?

    Occasional Reader (998d0b)

  16. If profiles are independent may be the telling phrase. Perhaps there’s a dependency that the FBI does not want found, or publicized. (We already know that there’s a dependency, in that some families have many members in the databases.) How are the loci to be used determined?

    htom (412a17)

  17. Why are mathematical illiterates like these reporters allowed to hit this hobby horse again and again? Obviously they have some anti-scientific belief that DNA evidence is poor — probably because it made OJ look like a criminal — and so they twist their report to “prove” it.

    Just another variation of flat-earther. I blame Bush.

    Kevin Murphy (0b2493)

  18. Here’s a working link to Weir’s paper on arXiv.

    htom (412a17)

  19. Certainly, the odds of anyone matching anyone else in the database are very high. The key question is whether they are roughly what the FBI predictions say they should be, or significantly higher or lower than that. That the FBI fights against having such studies done at all is, I submit, inexcusable. Do the studies, let the experts crunch the numbers, and if all the studies show is what the experts have been saying all along, great.

    This is the right answer.

    Further, while Felch and Dolan blew some of the statistics in the article, I don’t think many commenters here really would have done much better. Statistics and probability are not intuitive and people generally can’t engage in accurate probabilistic reasoning, at least not without considerable work. This is why people still argue about the Monty Hall problem, why people don’t immediately grasp Simpson’s Paradox, and why even some of the top experts blow it when evaluating choices under uncertainty, leading to the Allais, Ellsberg and other paradoxes.

    That the FBI went nuts trying to cover up the completely uncontroversial empirical observation of a well known mathematical statistical result speaks volumes to its integrity. Is FBI leadership too stupid to understand a simple and well known statistical issue? Or are they just attempting to use institutional power to prevent ordinary citizens from noticing a fundamental mathematical truth?

    I largely agree with this. One point I would add is that the statistics are often stated as if the people are totally unrelated, that is, as if you pick two random people. Statistics and probabilities for other scenarios are not presented and, in the Puckett case, were explicitly prohibited from being introduced. As Xrlq rightly notes, this makes the results dubious at best and…well, junk at worst. If an article like this makes people re-evaluate the value of DNA evidence in establishing guilt as opposed to establishing innocence, then it is a good thing.

    Steve Verdon (4c0bd6)

  20. Never having taken a statistics course, the safest thing for me to say is, “Uh?” But the FBI doth seem to protest overmuch.

    quasimodo (b903f8)

  21. Xrlq, I made neither error that you allege. But the last time I tried to explain why, you cut my explanation off in mid-sentence and used the out-of-context phrase to argue something I had made perfectly clear I was not arguing. As much as I would enjoy recreating that joyous experience, I think I’ll pass. I’ll just note that I did not make the errors you describe.

    As for the substance of the article: I think I’ll pass on trusting that the LAT set forth the issue fairly, despite their fine track record of evenhandedness. I doubt that the FBI guy really gave as terse a denial as portrayed. I have an e-mail in to him.

    Patterico (15b6ef)

  22. I agree with Patterico that the LAT story has a biased tone in that the near matches discussed do not appear to be actually unexpected. I agree with Xrlq that the FBI’s calculations should be subject to scrutiny by outside experts.

    While reading the LAT story I followed a link to this interesting story about searching databases for matches to relatives. Such searches would be more effective with a greater number of markers. And of course DNA technology is constantly improving. So perhaps the number of markers in a standard DNA profile should be increased.

    James B. Shearer (fc887e)

  23. But the last time I tried to explain why, you cut my explanation off in mid-sentence and used the out-of-context phrase to argue something I had made perfectly clear I was not arguing.

    Gosh, this sounds familiar…

    Drumwaster (5ccf59)

  24. Xrlq, I made neither error that you allege.

    Previously you refused to acknowledge that those assertions were errors, now you deny making them at all? Really? So if I were to quote you as accusing Felch and Dolan of having “omitted a key assumption: that the database consisted entirely of innocent people,” as opposed to correctly assuming that the database includes roughly 338,000 innocent people (plus a very small but undetermined number of not-so-innocent ones and/or blood relatives, neither of whom are contemplated by RMP), then surely I’d be lying, right? As I would be if I were to suggest that you had portrayed the 1 in 3 figure, which represents the odds that one of 338,000 innocent non-relatives in the database will generate a random match, as though that figure were “really just a 1 in 3 chance of a ‘match, period.’” Just another dirty rotten lie by that old L.A. Times shill, Xrlq. Of course you can always play the “context” card, which is why I linked to the contexts so any reader can easily determine for himself/herself whether my summary was accurate or not. In that vein, I hereby offer a free box of cereal to the first commenter in this thread who can credibly demonstrate which of the two statements means something materially different in their original contexts than they do outside. Sorry, no box of cereal for the guy who tells us what the clause-level fisking in that thread had to do with either of the issues I raised here (i.e., nothing).

    If there turn out to be other problems with the article, such as the FBI being misquoted, I’m keen to hear them. However, that still wouldn’t justify this post, which made a much more specific allegation that is simply wrong (or if “right,” then only so under a theory broad enough to prove that every front-page story is objectionable if the entire story is not contained on the front page).

    Xrlq (b71926)

  25. And Drumwaster, don’t try to piggyback on this one. Patterico is a smart guy with one particular blind spot. You’re an all-purpose dumbass, not even in the same league. Though I will grant you this much: you’re obviously not dumb enough to take me up on my challenge. So maybe most-purpose dumbass is more like it.

    Xrlq (62cad4)

  26. Though I will grant you this much: you’re obviously not dumb enough to take me up on my challenge.

    You first, Wade Walker…

    Drumwaster (5ccf59)

  27. Your quotes do not support the alleged errors.

    If you want to find someone reasonable to argue the point on your behalf, be my guest. After the tactics you employed in the previous thread, I’ll not re-argue this issue with you.

    Patterico (cb443b)

  28. Drum: I already went first. You began the debate by barking up the wrong tree (the federal Constitution) so I replied by barking up one of the 50 right ones (state law, which in this case also happened to be a constitution). I chose California not because its education laws are particularly unusual (they’re not) but because California is (1) the state whose laws I know the most about (2) your state, (3) the state most closely connected with this blog and (4) by far the most populous state. Now it’s your turn to pick a state. You won’t, of course, because you know damned well that no matter what state you choose, the end result will be the same (aside from the irrelevant detail that the right in question may be codified in a regular statute or court decision rather than a constitution). So instead, you claim a phony victory based solely on the fact that I’ve got better things to do than to prove you wrong 49 more times, just to prove that the assumptions of 99% of Americans are in fact right and yours are wrong – while you, OTOH, would only have to prove me wrong once to make the whole issue go away. This is way off topic, though, so I’m done discussing the issue in this thread.

    Patterico: I’m a perfectly reasonable individual, thank you very much. I’m not particularly interested in re-arguing the merits of your earlier statements – they are what they are – but to the extent you think my paraphrases of your statements are inaccurate, it’s up to you to explain why. Simply saying they’re wrong, without further explanation, is no more convincing than Felch’s insistence that your paraphrase of his statement was wrong. As was the case when you paraphrased them, I think your statement and my paraphrase mean the same thing. You say they don’t. I say fine, do as Felch and/or Dolan should have done in your case. Show us the difference, don’t just tell us there is one.

    Bottom line is, you either intended to say that the 1 in 3 figure (when presented as the likelihood that at least one innocent non-relative among 338,000 will randomly generate a match) is somehow dependent on whether the killer is in the database, or you did not. If you did, then you fundamentally misunderstood the RMP, its interplay with the prosecutor’s/selection fallacy, or both. If you did not intend to say that, then frankly, you did a pretty crummy job of communicating whatever your actual intent was. Either way, it’s your failure to correct, not to blame on my evil “tactics” (a lame allegation in its own right, as it implies that there’s something nefarious about addressing an error where it occurs rather than leaving the rest of the sentence for others to wade through in search of your point).

    If I were the only reader who came away with the above reading of your take on the 1 in 3 figure, you might be able to chalk this up to a reading comprehension failure on my part rather than a communication failure on yours. But I wasn’t the only reader who interpreted it that way; in fact, we have yet to hear from a single reader/commenter who interpreted it differently. At least one such reader, cfbleachers, not only interpreted your statement as I described, but actually believed that interpretation and went on to imply that the “leftist media” may have botched the story on purpose. Are you going to blame that misunderstanding on me and my evil “tactics,” too?

    Xrlq (b71926)

  29. Now it’s your turn to pick a state.

    Why? You haven’t provided evidence of what was asserted.

    Still waiting, Catvomit.

    Drumwaster (5ccf59)

  30. a lame allegation in its own right, as it implies that there’s something nefarious about addressing an error where it occurs rather than leaving the rest of the sentence for others to wade through in search of your point

    No, you didn’t address an error where it occurred, you cut off a sentence in mid-sentence to make it mean something that it didn’t mean. If you’re going to continue to justify that, there is no point in discussing any of it with you. Why increase my blood pressure arguing with someone who employs tactics like that? I won’t do it.

    Were you to apologize for that tactic, I might consider discussing the matter. If you’re going to continue defending such a maddening and (in effect if not, I assume, intent) dishonest tactic, it’s just pointless trying to speak with you on this issue.

    Patterico (cb443b)

  31. Closer, but still misses the mark.

    Too bad you so-called “lawyers” can’t read simple English.

    Drumwaster (5ccf59)

  32. Drumwaster,

    You have a unique point of view that no one can argue against.

    And I finally understood your Dr. Demento reference. Go on enjoying your Wet Dream.

    You are on ignore from now on.

    nk (a4be1f)

  33. My bad.

    U.S. Constitution, Art. III: The judicial power of the United States shall be vested in one Drumwaster who can make up any shit he wants on a blog.

    Sorry for any disrespect, Your Honor.

    nk (a4be1f)

  34. Hey, clown! Do you think that murder is a crime in every jurisdiction in the United States? Or is that an assertion I must prove and it’s unfair to ask you to prove the negative?

    nk (a4be1f)

  35. Blah, blah, blah.

    If you can’t provide the evidence, maybe you should have kept your piehole tightly closed. Meanwhile your sandbox is out back, just waiting for you to start eating the contents again.

    Drumwaster (5ccf59)

  36. Regarding Xrlq’s argument in the first link he supplied, I’m with him.

    Suppose the database has the killer in it. Presumably that killer would be found. What are the chances that we’d get one or more additional matches? We can use the binomial distribution to calculate that, and the probability is 0.2645. Now suppose the killer is not in the database; then the probability is 0.2646. In short, the difference is 0.0001, a trivial difference.
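
A minimal sketch of the binomial calculation described above. The database size of 338,000 comes from this thread; the random match probability of 1 in 1.1 million is an assumption (it is the figure that makes the thread's "1 in 3" work out, but it is not stated here), and the function name is hypothetical.

```python
import math


def prob_at_least_one_match(n_innocent: int, rmp: float) -> float:
    """Probability that at least one of n_innocent unrelated, innocent
    profiles coincidentally matches, i.e. 1 - (1 - rmp) ** n_innocent."""
    # log1p/expm1 keep precision when rmp is tiny and n_innocent is large.
    return -math.expm1(n_innocent * math.log1p(-rmp))


DATABASE_SIZE = 338_000   # from the thread
RMP = 1 / 1_100_000       # ASSUMED random match probability

# Killer absent: every one of the 338,000 profiles is an innocent candidate.
p_without_killer = prob_at_least_one_match(DATABASE_SIZE, RMP)

# Killer present: one profile is his, leaving 337,999 innocent candidates.
p_with_killer = prob_at_least_one_match(DATABASE_SIZE - 1, RMP)

print(f"killer absent : {p_without_killer:.4f}")
print(f"killer present: {p_with_killer:.4f}")
print(f"difference    : {p_without_killer - p_with_killer:.2e}")
```

Under these assumed inputs both probabilities come out at roughly 0.26, and the gap between them is far smaller than 0.0001, so the conclusion that the difference is trivial holds either way.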

    Steve Verdon (94c667)

  37. Oh, and Xrlq, despite what you might think, making probabilistic statements about past events is not nonsensical.

    I give you this passage from Dale Poirier’s textbook, Intermediate Statistics and Econometrics: A Comparative Approach:

    The Frequency interpretation [of probability] cannot be applied: (i) to unique, once-and-for-all phenomenon (e.g. elections) (ii) to theories (e.g. “monetarists” or “Keynesian economics”), or (iii) to uncertain past events (e.g. whether the Cleveland Indians won the 1954 World Series). [emphasis added]

    Do we, even today know that Puckett was guilty? No. Hence we don’t know if his selection was one of the innocent members of the database or the guilty party. It is an uncertain past event.

    However, the Logical and subjective interpretations of probability allow for such applications. So as a Bayesian I can talk about the selection of Puckett in probabilistic terms whereas a Frequentist cannot.

    Steve Verdon (4c0bd6)

  38. However, the Logical and subjective interpretations of probability allow for such applications. So as a Bayesian I can talk about the selection of Puckett in probabilistic terms whereas a Frequentist cannot.

    Why don’t we just weigh the defendant and see if he weighs as much as a duck?

    nk (a4be1f)

  39. No, you didn’t address an error where it occurred, you cut off a sentence in mid-sentence to make it mean something that it didn’t mean.

    Oh really. Here is the part I quoted:

    If you’re taking a situation where you received only one hit..

    In isolation, that part sure makes it sound as though you had analyzed the 1 in 3 selection odds in terms of a fact that could not have been known until after the database search in question had been conducted – a classic example of the selection/prosecutor’s fallacy. But surely that wasn’t really your position, right? Surely if I had instead quoted the full paragraph, it would have become immediately obvious to any reader you were really talking about the selection odds before the database search occurred, and not committing the very same fallacy you had rightly called Felch and Dolan on, right? Er… no. Here it is, in full glory, untainted by Xrlq’s maddening and unintentionally dishonest (whatever the hell the latter is supposed to mean) tactics:

    If you’re taking a situation where you received only one hit, the most useful thing for the jury to know that can be based on the numbers we know is this: what are the chances that a search of a database of innocent people will result in a hit (which by definition will be a hit to an innocent person)? That’s what the database adjustment is there to illustrate. But the key assumption is that the database consists entirely of innocent people.

    In other words, my little snippet made it sound as though your theory about the selection odds assumed only one fact that was unknowable before the database search occurred (how many hits there would be), when in reality, your theory assumed two such facts, one of which is not always knowable even after the search has been conducted (i.e., we know the database consisted entirely of innocents if no hits turned up, but under any other circumstances we have to guess), and which in any event was not a part of the theory you were purporting to apply (which merely assumes that the database includes approximately 338,000 unique records from innocent non-relatives, not that it excludes anything else). In other words, my maddening, “unintentionally dishonest” tactics made it look as though you were wrong on one level, when you were really wrong on three. My bad.

    Were you to apologize for that tactic, I might consider discussing the matter.

    Aside from making your position look less unreasonable than it actually was, WTF am I supposed to apologize to you for? You’re the one who has consistently made this personal, throwing around labels like “dishonest” or “maddening,” and speciously accusing me of misrepresenting your position even while steadfastly refusing to clarify what your original point supposedly was. Previously I asked you the same goddamned question you (rightly) asked Felch and Dolan when they (wrongly) accused you of doing the same to them. Thus far, you’ve handled that question about as well as they did, so I ask again. Show me precisely how my paraphrases of your statements supposedly differ in substance from the actual positions you were attempting to communicate, and I’ll give due consideration to whether or not any apology is in order. But from where it sits right now, you are the one owing an apology, not for being full of crap on the underlying issue (although you clearly are), but for being a dick about it.

    Xrlq (62cad4)

  40. Steve: I never said you can never make probabilistic statements about past events. Of course you can. What I did say – or meant to say, anyway – is that you can’t simply apply the original selection odds after the fact, without adjusting for scenarios that formed part of the original odds but can now be ruled out. That, plus the fact that when determining which of two possible explanations is more likely to be correct (here, that there was either that 1 in 3 match per the RMP or some unknown possibility that the killer himself was in there to generate a match) you have to have a fair amount of knowledge about both probabilities, not just one. If a database search yields only one match, and the original odds of the killer being in the database were greater than 1 in 3, then it’s more likely that that one hit was to the killer. If the odds were less than 1 in 3, then it’s more likely that the hit was to an innocent non-relative.

    Xrlq (62cad4)

  41. OK, you’re right (sort of) about me having originally misinterpreted your statement in the previous thread. I say “sort of” because while I did indeed misread your point in that thread, that misreading had zero/zip/nada to do with the fact that I responded at exactly the point where I saw you going astray, rather than quoting back the rest of the paragraph. Whatever misunderstandings may result from “cutting someone off mid-sentence” in real life are wholly irrelevant here, as I read your entire comment before responding to any of it. But as I’ve shown above, reading the entire comment in context didn’t help, as the entire comment lends itself equally well to the same misinterpretation, and more.

    Your point in that thread, as I understand it, was that once a database search has been run, something special happens if it returns just one search, and that somehow affects how the original 1 in 3 odds ought to be presented to the jury. Is that a fair summary of your position? It’s still wrong, of course, but before explaining why, I’d like to know if I’m at least barking up the right tree this time around.

    Xrlq (62cad4)

  42. I’m coming back to this after a period of time, and I see that you tried to make some kind of effort, so I’ll spend a second to respond. If you start employing unfair tactics I’ll quit instantly.

    Your point in that thread, as I understand it, was that once a database search has been run, something special happens if it returns just one search, and that somehow affects how the original 1 in 3 odds ought to be presented to the jury. Is that a fair summary of your position? It’s still wrong, of course, but before explaining why, I’d like to know if I’m at least barking up the right tree this time around.

    You’re getting closer. If there is just one search, then you have to be careful about how you present the fact that the odds of a hit to an innocent person are 1 in 3. You have to be especially careful not to phrase it in a way so that jurors mistakenly *conclude* what the L.A. Times actually *falsely stated* in their original article — namely, that the 1 in 3 number means there is a 1 in 3 chance that the guy sitting in the defendant’s chair is innocent.

    The best way, in my view, to avoid that false conclusion by jurors is to state the odds by using an *assumption* that the database is composed only of innocents, and then note that even in that scenario, there’s a 1 in 3 chance of a hit — which, by necessity, would be a hit to an innocent.

    It’s not necessary to make that assumption in order to correctly state that there is a 1 in 3 chance of a hit to an innocent person. BUT when you have actually run the search, and received only one hit, even educated people will misread that statement as a statement of the odds of innocence of the guy who was hit.

    That’s my position. Please don’t distort it; please try to understand it; and maybe we can come to a meeting of the minds on this.

    Patterico (cc3b34)

  43. Fair enough, but it sounds like you are arguing strategy rather than alleging an actual, objective error. Strategy is more of an art than a science. You may think assuming a perp-free database is the best way to avoid confusion over pre- and post-odds. I tend to disagree, but no matter. What matters is that it’s certainly not the only way the data can be presented to avoid that confusion, and for that reason alone, “error” is an unduly strong word to use for a disagreement over strategy. Let’s assume for argument’s sake, though, that it is correct. Let’s assume that the only way it is possible to describe the 1 in 3 figure without confusing jurors as to selection odds vs. after the fact odds is to posit a perp-free database. In that case, would the following be a fair summary of your two outstanding objections to the May article:

    1. At one point in the article, Felch and Dolan confused the selection odds with the odds after the fact (i.e. committed the prosecutor’s/selection fallacy).
    2. At another point in the article, Felch and Dolan presented technically accurate data in a manner which, given the context, would have led a reasonable reader to commit exactly the same error they themselves had committed in #1 above.

    Is that a fair characterization of your two objections? If not, where did I go wrong? And if so, isn’t it clear we’re really talking about one error here rather than two?

    Xrlq (b71926)

  44. Fair enough, but it sounds like you are arguing strategy rather than alleging an actual, objective error. Strategy is more of an art than a science. You may think assuming a perp-free database is the best way to avoid confusion over pre- and post-odds. I tend to disagree, but no matter. What matters is that it’s certainly not the only way the data can be presented to avoid that confusion, and for that reason alone, “error” is an unduly strong word to use for a disagreement over strategy.

    Their statement that there was a 1 in 3 chance that Puckett was innocent was an error.

    Their description of the issue, while omitting what I have described as a key assumption (that the database consisted of innocent people), was not an “error” as such, but rather contributed to a misleading impression — namely, that the 1 in 3 statistic related to Puckett’s innocence.

    As I have explained many times, the fact that the search revealed only 1 hit, and that hit was to Puckett, means that most descriptions of a 1 in 3 chance of a hit to an innocent person run a serious chance of misleading even smart people into thinking that the statistic refers to the one hit that emerged. This misleading impression is compounded when an erroneous verb tense is used. But even if the correct tense is used, people are likely to be misled. It has happened on this site more than once.

    That is why I subscribe to David Kaye’s preferred method of explaining the significance of the adjustment. Namely, if the jury knows a database search was run, they should ideally be told that, assuming the database consists of unrelated innocent individuals, there is still a 1 in x chance of a hit, which would necessarily be a hit to an innocent person, given the assumption.

    Patterico (cc3b34)

  45. From reading Kaye’s paper, and from his email response, I’m not convinced he really understands the issue, either. I note he’s a law professor, not a statistician as I had previously assumed. Nevertheless, the 1 in 3 figure (as a representation of the likelihood of a match to an innocent non-relative) does not assume the database consists exclusively of innocents, only that it contains approximately 338,000 such records. This is true whether the perp is in there or not. So rather than mislead the jury into believing that the 1 in 3 figure is somehow dependent on the presence or absence of the perp in the database (a misconception that is every bit as far off the mark as implying that one-third of all Californians would match the profile), why not instead tell them what clearly IS true, namely:

    1. There were 1 in 3 odds that the search would yield a random match to an innocent non-relative of the perp. If those 1 in 3 odds materialized in this case, then Puckett was that innocent non-relative.
    2. There were 1 in n odds – and we have no idea what the value of n is, so we really shouldn’t be trying this case at all, but what the hell, you’re the jury so go ahead and give it a SWAG – that the perp himself was in the database, in which case he almost certainly would have yielded a match. If those 1 in n odds materialized, Puckett was that perp.
    3. This particular database search yielded only one hit, so either #1 or #2 has occurred here, but not both.
    4. Therefore, depending on what random value you the jury end up arbitrarily assigning to n, it is either more likely that Puckett is the perp than an innocent non-relative, or it is more likely that he is an innocent non-relative than the perp, or the odds are roughly 50-50. If your value of n is greater than 3, then based on the DNA evidence alone it is more likely that Puckett is an innocent non-relative than the perp. If your value of n is 3, or very close to it, then based on DNA evidence alone, the odds are about 50-50 that Puckett is the perp. If your value of n is 1.5, then it is twice as likely that Puckett was guilty rather than innocent, and by a bizarre coincidence, L.A. Times’s 1 in 3 figure was right, but that’s still not good enough to get past reasonable doubt (though it could be if these 2 in 3 odds of guilt based on DNA evidence are viewed in conjunction with equally probative non-DNA evidence).
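
The comparison in points 1 through 4 can be sketched as follows. This is only the simplified weighing the list describes (the 1 in n odds that the perp was in the database against the 1 in 3 odds of a coincidental match, treated as the only two explanations for a single hit), not a full Bayesian treatment. The function name and the sample values of n are hypothetical.

```python
from fractions import Fraction

# From the thread: odds that the search yields a coincidental match
# to an innocent non-relative.
INNOCENT_MATCH_ODDS = Fraction(1, 3)


def share_pointing_to_perp(n) -> Fraction:
    """Given 1-in-n odds that the perp was in the database, return the
    share of the two competing explanations that points to the perp.

    Simplifying assumption (taken from the list above): exactly one hit
    occurred, so it is either the perp or a coincidental match, never both.
    """
    perp_in_database = 1 / Fraction(n)
    return perp_in_database / (perp_in_database + INNOCENT_MATCH_ODDS)


for n in (Fraction(3, 2), 3, 10):
    share = share_pointing_to_perp(n)
    print(f"n = {float(n):>4}: share pointing to the perp = {share}")
```

With n = 3 the two explanations are evenly balanced; with n = 1.5 the perp explanation is twice as heavy as the innocent one; any n above 3 tips the balance toward a coincidental match.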

    Sound ridiculous? Sure, but that became a given as soon as jeopardy attached in this case. The only non-ridiculous thing to have done would be to have dropped the case because we have no clue what the value of n is, and the rest of the evidence is so thin that there’s no way a jury could properly have convicted the guy beyond a reasonable doubt – and in fact, it’s far from clear that there was even probable cause.

    Xrlq (62cad4)

