Interesting reactions to my post about the “jam experiment,” which purported to show that consumers bought less jam when confronted with more choices.
One fella named Chip S. came along to say that the whole argument was a ridiculous strawman; nobody really believes that result.
Chip S. was annoyed because I had posited, without doing research on it, that this study would be the type of thing Big Media and Freakonomics types would write about:
You can easily see journalists writing an article that wows the public. Call the Freakonomics guys. Can’t you envision a section of a chapter talking about this surprising result, and discussing the likely reasons for it? Perhaps the shoppers were paralyzed by indecision when presented with so many choices. Valuable information, certainly, for any marketer to know.
Valuable — and almost certainly wrong.
Was I wrong that this would be the sort of study Big Media and Freakonomics types would jump on?
Please. Would I ask the question so publicly if the answer were unflattering to my predictive powers?
I looked this evening. There is a New York Times article pushing the idea. There is a book that features the study as an example of how choice can paralyze people. There are blog posts making the same point. And plenty of my commenters came out in support of the jam study’s conclusion, seemingly overlooking Manzi’s assertion (as reported in my post) that, if anything, studies show the opposite of what the jam experiment purports to show.
Why, as it turns out, even the Freakonomics people wrote about it. What is especially funny about their post is that they clearly love the counter-intuitive finding, admit that it fits their preconceptions — and then quote someone showing the study’s results cannot be replicated consistently . . . and then . . . and then conclude that, hey, maybe people should streamline choices anyway:
So even if jam studies of the future prove inconclusive, it still seems wise to streamline choices whose complexity might otherwise hamper a good outcome.
In other words, the result is just so much fun, it’s a shame to toss it overboard just because it cannot be shown to be accurate.
People love this result so much that I went looking for Manzi making the argument in writing, so that he could make it himself rather than have me report something I heard in a podcast. I found him discussing it in more detail at the Corner, and thought it was worth quoting at length:
What are the odds that we would see one randomly chosen group of about 100 of the people who were given a coupon have a redemption rate that is ten times as large as another similarly sized random group of people given the exact same coupon? It’s larger than you might think. Consider an example. A recent in-store coupon executed by a large-format grocery-store chain was distributed to more than 1.3 million shoppers. I randomly divided them into about 13,000 groups of 100 shoppers each. I then randomly paired each of these groups with one other, creating about 6,500 randomly matched pairs of randomly selected groups of 100 shoppers. In a little over 9 percent of these pairings, the redemption rate was at least ten times as high in one group as in its matched pair. The jam experiment, by this simplified and indicative metric, would fail to achieve standard measures of statistical confidence required to reject the hypothesis that this was just random variation.
And while the specifics will vary for any given coupon – based on characteristics like product category, average redemption rate, time of year, and so forth – this indicative analysis almost certainly understates the actual probability of seeing this much difference between the two groups in the experiment. The two groups of jam buyers were not assigned randomly. Because the experiment was done for a total of ten hours in only one store, and because shoppers were grouped in hourly chunks, there are all kinds of reasons why the people who happened to show up during the five hours of limited assortment might have different propensity to respond to one-dollar-off coupons for a specific line of jams than those who arrived in the other five hour period. Maybe a soccer game finished at some specific time, and several of the parents who share similar propensities versus the average shopper came in nearly together, or maybe a bad traffic jam in a part of town with non-average propensity to respond to the coupon dissuaded several people from going to the store at one time versus another. Remember, all of the inference is built on the purchase of a grand total of 35 jars of jam. This is one reason why rigorous retail experiments, when a lot of money is at stake, are typically executed for dozens of randomly assigned stores for a period of weeks — and even sample sizes like that are pushing the envelope of causal inference.
But the result is at least interesting, and the right way to figure out whether or not the result is valid and generalizable is replication. Over the past ten years, a number of such experiments have been done by academics to evaluate the asserted paradox of choice for product categories ranging from mp3 players to mutual funds, and a paper was published in February (Scheibehenne, et al.) that conducted a meta-analysis of 50 of them (h/t Tim Harford). Across all of these experiments, the average effect of increasing choice on consumption or satisfaction was “virtually zero.” Further, this meta-analysis showed a positive average effect of increasing choices for those experiments that, like the jam experiment, tested the effect of choice on consumption quantity, rather than some measure of satisfaction, as the outcome. That is, when it comes to sales, more choice is better.
This is consistent with all of the unpublished assortment experiments that I’ve seen, and should not be surprising. As a store adds more and more products to a given product line assortment – say, canned soup – sales will rise sub-linearly with product count.
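For anyone who wants to get a feel for the pairing exercise Manzi describes above, here is a rough Python sketch. It is a toy version, not his analysis: he worked with real coupon data from more than 1.3 million shoppers, while this assumes every shopper redeems independently at a single made-up rate. The function name `simulate_pairings`, the 2 percent `redemption_rate`, the seed, and the rule for handling groups with zero redemptions are all assumptions for illustration, so the share it prints should not be expected to match his "a little over 9 percent."

```python
import random


def simulate_pairings(redemption_rate, n_groups=13_000, group_size=100,
                      factor=10, seed=0):
    """Share of random group pairs where one group's redemptions are at
    least `factor` times the other's.

    Assumption: every shopper redeems independently with the same
    probability `redemption_rate` (a hypothetical figure, not from the
    coupon data Manzi used).
    """
    rng = random.Random(seed)

    # Number of redemptions in each group of `group_size` shoppers.
    counts = [
        sum(rng.random() < redemption_rate for _ in range(group_size))
        for _ in range(n_groups)
    ]

    # Randomly match each group with one other group.
    rng.shuffle(counts)
    pairs = list(zip(counts[0::2], counts[1::2]))

    # Assumption: a pair with zero redemptions on one side and at least
    # one on the other counts as "at least ten times as high"; a pair
    # with zero on both sides does not.
    extreme = sum(
        1 for a, b in pairs
        if max(a, b) > 0 and max(a, b) >= factor * min(a, b)
    )
    return extreme / len(pairs)


if __name__ == "__main__":
    # 13,000 groups of 100 shoppers, giving 6,500 matched pairs.
    share = simulate_pairings(redemption_rate=0.02)
    print(f"Pairs with a 10x or larger gap: {share:.1%}")
```

The answer moves a lot as you change `redemption_rate`, which fits Manzi's caveat that the specifics vary from coupon to coupon. The point of the sketch is only that a tenfold gap between two groups of 100 can turn up by chance alone.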
The key, again, is whether repeated experiments produce a predictable result — not how much fun the answer is, or whether it is in line with your preconceptions.
It’s a hard lesson to remember, but I think it’s a valuable one.