<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Author of DNA Article Responds to Accusation That the Other Side Was Not Fairly Represented</title>
	<atom:link href="http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/feed/" rel="self" type="application/rss+xml" />
	<link>http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/</link>
	<description>Harangues that just make sense</description>
	<pubDate>Fri, 08 Aug 2008 18:46:17 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
		<item>
		<title>By: Rich Rostrom</title>
		<link>http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339887</link>
		<dc:creator>Rich Rostrom</dc:creator>
		<pubDate>Thu, 08 May 2008 02:01:20 +0000</pubDate>
		<guid isPermaLink="false">http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339887</guid>
		<description>As I've written before: if the set of loci in the sample DNA have 1.1M variations, then there is about a 1/3 chance of finding a match by coincidence in a set of 338K profiles. The odds against a particular profile match are not impressive when the number of profiles is that large.

One would be very surprised to be dealt a royal flush: the odds against a royal flush are 649,739 to 1. But not if one was dealt 200,000 hands.

The DNA match in this case was a good lead and supportive evidence, but not close to proof. The LAT is basically right - there is a 1/3 chance of this search turning up a false positive. 1.1M to one against this guy matching, but only 3 to 1 against &lt;i&gt;someone&lt;/i&gt; matching.</description>
		<content:encoded><![CDATA[<p>As I&#8217;ve written before: if the set of loci in the sample DNA have 1.1M variations, then there is about a 1/3 chance of finding a match by coincidence in a set of 338K profiles. The odds against a particular profile match are not impressive when the number of profiles is that large.</p>
<p>One would be very surprised to be dealt a royal flush: the odds against a royal flush are 649,739 to 1. But not if one was dealt 200,000 hands.</p>
<p>The DNA match in this case was a good lead and supportive evidence, but not close to proof. The LAT is basically right - there is a 1/3 chance of this search turning up a false positive. 1.1M to one against this guy matching, but only 3 to 1 against <i>someone</i> matching.</p>
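<p>A quick way to check these figures is a minimal Python sketch, assuming every profile (or hand) is an independent trial with the same small chance of a coincidental hit; the function name <code>chance_of_any_match</code> is made up for illustration.</p>
<pre>
```python
def chance_of_any_match(single_match_prob, trials):
    """Chance of at least one coincidental match across independent trials."""
    # Chance that every single trial misses, subtracted from 1.
    return 1 - (1 - single_match_prob) ** trials

# Assumed inputs taken from the comment above: 1-in-1.1M profile, 338K profiles.
dna = chance_of_any_match(1 / 1_100_000, 338_000)
# Royal flush: 1 chance in 649,740 per hand, 200,000 hands.
poker = chance_of_any_match(1 / 649_740, 200_000)

print(f"DNA database: {dna:.3f}")
print(f"Royal flush:  {poker:.3f}")
```
</pre>
<p>Under those assumptions both come out near 0.26, between one in four and one in three.</p>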
]]></content:encoded>
	</item>
	<item>
		<title>By: SPQR</title>
		<link>http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339777</link>
		<dc:creator>SPQR</dc:creator>
		<pubDate>Wed, 07 May 2008 20:27:22 +0000</pubDate>
		<guid isPermaLink="false">http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339777</guid>
		<description>Paul, I don't think you are following the discussion. The issue arose because the kind of comparison that could be done was a limited comparison with a limited number of loci.  This changed the statistics of the comparison.</description>
		<content:encoded><![CDATA[<p>Paul, I don&#8217;t think you are following the discussion. The issue arose because the kind of comparison that could be done was a limited comparison with a limited number of loci.  This changed the statistics of the comparison.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul Guantonio</title>
		<link>http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339774</link>
		<dc:creator>Paul Guantonio</dc:creator>
		<pubDate>Wed, 07 May 2008 20:20:16 +0000</pubDate>
		<guid isPermaLink="false">http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339774</guid>
		<description>I have a question as well as a comment. 
When a cold hit is made and that person is apprehended, how often is a new DNA test done on the identified person to confirm that there was no error in the database? I would hope that a retest is routine; in fact, I would hope it is required, preferably by law. 

It seems to me that the question is: how reliable is the match between the crime scene DNA and the DNA of the individual identified? This has nothing to do with the size of the database. The size of the database has NO influence on the likelihood that the match is in error. People seem to be confusing the database size with the contents of the database, which are wholly independent. No matter how big the database gets, none of the data is changed by the increase in its size. The only database that is relevant is the one consisting of all the people in the world, including all those who have died since a given crime was committed. This is the only database that counts and the only one controlling the reliability of the identification.

When a search of a database is made, what is searched for is: is there a match between the crime scene DNA and any of the DNA samples in the database? The question of reliability is the chance of error when comparing two samples of DNA, not where or how the non-crime-scene DNA was obtained.</description>
		<content:encoded><![CDATA[<p>I have a question as well as a comment.<br />
When a cold hit is made and that person is apprehended, how often is a new DNA test done on the identified person to confirm that there was no error in the database? I would hope that a retest is routine; in fact, I would hope it is required, preferably by law. </p>
<p>It seems to me that the question is: how reliable is the match between the crime scene DNA and the DNA of the individual identified? This has nothing to do with the size of the database. The size of the database has NO influence on the likelihood that the match is in error. People seem to be confusing the database size with the contents of the database, which are wholly independent. No matter how big the database gets, none of the data is changed by the increase in its size. The only database that is relevant is the one consisting of all the people in the world, including all those who have died since a given crime was committed. This is the only database that counts and the only one controlling the reliability of the identification.</p>
<p>When a search of a database is made, what is searched for is: is there a match between the crime scene DNA and any of the DNA samples in the database? The question of reliability is the chance of error when comparing two samples of DNA, not where or how the non-crime-scene DNA was obtained.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daryl Herbert</title>
		<link>http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339635</link>
		<dc:creator>Daryl Herbert</dc:creator>
		<pubDate>Wed, 07 May 2008 14:55:25 +0000</pubDate>
		<guid isPermaLink="false">http://patterico.com/2008/05/07/author-of-dna-article-responds-to-accusation-that-the-other-side-was-not-fairly-represented/#comment-339635</guid>
		<description>I laid out the math in my comments #39 and #41 in &lt;a href="http://patterico.com/2008/05/06/volokh-on-dna-and-cold-hits/#comment-339496" rel="nofollow"&gt;your most recent thread&lt;/a&gt;

The short version is: the probability that it's a good match, as opposed to an innocent hit, &lt;b&gt;when a single hit only is returned&lt;/b&gt;, depends on how likely the DB is to contain the perpetrator's DNA.

If the DB is 100% likely to contain his DNA, the chance of an innocent match is 0.  If he's in there, and there's only 1 result, it's going to be him!

If the DB is 0% likely to contain his DNA, the chance of an innocent match is 100%.  If he's not in the DB, it can't spit him out as an answer!

If the DB has a likelihood of P to contain his DNA, the chance of a guilty match, where 1-in-1.1M people will have matching DNA by chance, and the DB is of size 338k, is:

73.54 x P / ( (73.54 x P) + 22.60 x (1-P) )

If there's only a 1% chance the DB would have his DNA in it, then a single match returned carries only about a 3% chance the "match" is guilty.

At 50/50 odds, that's a 1-in-4 chance of innocence.

At 70/30 odds, that's only about a 1-in-9 chance of innocence.

Assuming 50/50 odds, which is what the LAT does, is wrong.  You can't make the assumption that there's a 50% chance every defendant is going to be in the DB.  That actually &lt;i&gt;undercounts&lt;/i&gt; the likelihood of an innocent match in many cases, because for certain types of crimes, the DNA DB is unlikely to contain the perp!  The LAT: biased in favor of the police!

So you should ask the question: &lt;i&gt;based on the circumstances of the crime&lt;/i&gt;, what is the likelihood that the perp is in the DB?  That means looking to the type of crime committed (a stranger-rape-murder 3 decades ago) and the type of DB (a sex offender DNA registry).  The police could probably figure out these statistics if they wanted to--they can probably tell you how many 3-decades-old stranger-rape-murder perpetrators they believe are in the DB vs. how many they believe remain at large (or have since died without being caught).*

But the police aren't going to bother with that.  They want to use their &lt;i&gt;investigation&lt;/i&gt; to get to the bottom of innocence/guilt and not rely on statistics.  It would be unfair to criminal defendants if the sole evidence against them was a DNA match that could belong to about 30 California men, and a police expert testifying that, based on the circumstances of the crime, there is probably a 90% chance that the perp is in the DB, so therefore the suspect is guilty.  This testimony would be confusing and probably prejudicial to the defendant.  It's really only of use to journalists and to people sizing up our criminal justice system as a whole (law profs, think tanks, etc.).

* if the police come back with a: we think there's a 10% chance 50% of them are in the DB, and a 20% chance 70% of them are in the DB, and a 30% chance 80% are in the DB, and a 40% chance 90% of them are in the DB--if that's what they come back with--then you're on your own, because I'm all mathed out.</description>
		<content:encoded><![CDATA[<p>I laid out the math in my comments #39 and #41 in <a href="http://patterico.com/2008/05/06/volokh-on-dna-and-cold-hits/#comment-339496" rel="nofollow">your most recent thread</a></p>
<p>The short version is: the probability that it&#8217;s a good match, as opposed to an innocent hit, <b>when a single hit only is returned</b>, depends on how likely the DB is to contain the perpetrator&#8217;s DNA.</p>
<p>If the DB is 100% likely to contain his DNA, the chance of an innocent match is 0.  If he&#8217;s in there, and there&#8217;s only 1 result, it&#8217;s going to be him!</p>
<p>If the DB is 0% likely to contain his DNA, the chance of an innocent match is 100%.  If he&#8217;s not in the DB, it can&#8217;t spit him out as an answer!</p>
<p>If the DB has a likelihood of P to contain his DNA, the chance of a guilty match, where 1-in-1.1M people will have matching DNA by chance, and the DB is of size 338k, is:</p>
<p>73.54 x P / ( (73.54 x P) + 22.60 x (1-P) )</p>
<p>If there&#8217;s only a 1% chance the DB would have his DNA in it, then a single match returned carries only about a 3% chance the &#8220;match&#8221; is guilty.</p>
<p>At 50/50 odds, that&#8217;s a 1-in-4 chance of innocence.</p>
<p>At 70/30 odds, that&#8217;s only about a 1-in-9 chance of innocence.</p>
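<p>The formula above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the two weights look consistent with a Poisson model in which the expected number of chance matches is 338k / 1.1M (about 0.307), giving roughly 73.54% for "no innocent match" and 22.60% for "exactly one innocent match". The function name and parameter names are hypothetical.</p>
<pre>
```python
import math

def guilty_given_single_hit(prior, db_size=338_000, match_odds=1_100_000):
    """Chance that a lone database hit is the perpetrator.

    prior is P, the chance the perpetrator's DNA is in the database.
    Assumes chance matches follow a Poisson distribution (an assumption
    made for this sketch, not something stated in the comment).
    """
    expected_chance_hits = db_size / match_odds
    # Perp is in the DB and nobody else matches: about 0.7354.
    no_innocent_match = math.exp(-expected_chance_hits)
    # Perp is absent and exactly one innocent person matches: about 0.2260.
    one_innocent_match = expected_chance_hits * math.exp(-expected_chance_hits)

    guilty = prior * no_innocent_match
    innocent = (1 - prior) * one_innocent_match
    return guilty / (guilty + innocent)

for prior in (0.01, 0.5, 0.7):
    print(f"P = {prior}: {guilty_given_single_hit(prior):.3f}")
```
</pre>
<p>Changing <code>prior</code> shows how strongly the answer depends on the chance that the perpetrator is in the database at all.</p>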
<p>Assuming 50/50 odds, which is what the LAT does, is wrong.  You can&#8217;t make the assumption that there&#8217;s a 50% chance every defendant is going to be in the DB.  That actually <i>undercounts</i> the likelihood of an innocent match in many cases, because for certain types of crimes, the DNA DB is unlikely to contain the perp!  The LAT: biased in favor of the police!</p>
<p>So you should ask the question: <i>based on the circumstances of the crime</i>, what is the likelihood that the perp is in the DB?  That means looking to the type of crime committed (a stranger-rape-murder 3 decades ago) and the type of DB (a sex offender DNA registry).  The police could probably figure out these statistics if they wanted to&#8211;they can probably tell you how many 3-decades-old stranger-rape-murder perpetrators they believe are in the DB vs. how many they believe remain at large (or have since died without being caught).*</p>
<p>But the police aren&#8217;t going to bother with that.  They want to use their <i>investigation</i> to get to the bottom of innocence/guilt and not rely on statistics.  It would be unfair to criminal defendants if the sole evidence against them was a DNA match that could belong to about 30 California men, and a police expert testifying that, based on the circumstances of the crime, there is probably a 90% chance that the perp is in the DB, so therefore the suspect is guilty.  This testimony would be confusing and probably prejudicial to the defendant.  It&#8217;s really only of use to journalists and to people sizing up our criminal justice system as a whole (law profs, think tanks, etc.).</p>
<p>* if the police come back with a: we think there&#8217;s a 10% chance 50% of them are in the DB, and a 20% chance 70% of them are in the DB, and a 30% chance 80% are in the DB, and a 40% chance 90% of them are in the DB&#8211;if that&#8217;s what they come back with&#8211;then you&#8217;re on your own, because I&#8217;m all mathed out.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

