Introduction

Central to theories of decision-making is the notion that human information processing is limited in capacity1. One classic manifestation of these constraints, when faced with the demands of a complex decision environment, is the context-dependent nature of human preference: our preference for an option depends not only on that option’s own intrinsic value, but also on the values of other, often irrelevant options2,3,4. Consider for example the choice a shopper faces between two wines, where the value of an option is computed across two dimensions: quality and price. When deliberating about which wine to choose among more than two desirable but competing options, a key tenet of rational (economic) choice, the “independence of irrelevant alternatives”, prescribes that decision-makers should ignore irrelevant, inferior alternative options5,6. According to this axiom, a shopper’s propensity to choose between two otherwise equally preferred wines should not be influenced by the introduction of a third option that is objectively inferior to one of the focal options. However, a large body of work suggests that both people7,8,9,10 and animals11,12 routinely violate this axiom in their decisions, finding that the introduction of irrelevant “decoy” options into a choice set systematically biases decision-making3.

Research examining decoy effects dates back several decades, with early studies conducted by Huber and colleagues7 and Tversky and Simonson4 demonstrating that individuals’ relative preference between two different options could be, under specific circumstances, biased by the introduction of a third, less attractive option. Since these initial studies, a spate of work has investigated the cognitive mechanisms underpinning decoy effects in consumer choice3,13,14,15,16,17,18,19. Despite these theoretical advancements, the specific conditions under which decoy effects take hold remain the subject of debate20,21.
For example, concerted replication attempts by Frederick et al. and Yang and Lynn found that decoy effects could only be observed in stylized laboratory settings using numeric (as opposed to visual) representations of options’ attribute values22,23 (see also ref. 24 for an attempt to replicate these results in a field experiment).

These heterogeneous results highlight how the observability of such decoy effects depends on, among other things, the configuration of options, attribute types, and product categories constituting the choice set in question21. While well-controlled laboratory studies have characterized the phenomenology of decoy effects and have informed our computational understanding of context effects15,18,25,26, relatively little work has investigated the extent to which decoy effects actually occur “in the wild” in larger option sets, as exemplified in real-world consumer choice. Previous studies have been limited in finding real-world evidence of these sorts of decoy effects in shoppers’ choices because, for example, the products under consideration were not amenable to quantifying option quality27, or the analyses required aggregation across consumer choices, obscuring potentially interesting features of both the consumers and the choice environment that may drive preference28. With respect to the latter point, unlike in tightly constrained laboratory settings, the choice sets facing real-world decision-makers are often arbitrarily large (e.g., a choice of 20–30 different wines in a convenience store), composed of options whose attribute values may not be readily observable (e.g., the quality of wine), and informed by decision-makers’ idiosyncratic past experiences with the options.

Understanding if, and how, such decoy effects play out in consumers’ real-world purchasing decisions is of considerable practical and theoretical importance.
Practically, even small decoy effects in real-world consumer choice can hold sizable economic consequences for both consumers and businesses: a 1% change in choice behavior, in the aggregate, can lead to hundreds of thousands of dollars in profits or losses28. Theoretically, interrogating these effects “in the wild” is important not only for understanding the boundary conditions of these well-studied laboratory effects (e.g.,29,30,31), but also for examining whether other moderating variables that exist in complex real-world decision environments allow them to be expressed at all22. This latter point is particularly important, given recent proposals in the literature that the true magnitude of decoy effects may be overstated by the artificial, albeit well-controlled, nature of tasks employed in laboratory choice studies22,23,32,33,34, in contrast to naturalistic consumer choice, where multiple sources of bias may influence individuals’ preferences (e.g., anchoring effects;35). Moreover, experimental work on decoy effects has focused almost exclusively on trinary choice paradigms, in which the preference between two target options is tested in the presence (versus absence) of a single decoy option22,23. Real-world consumer choice settings, however, often entail a choice between several options, with the possibility that multiple inferior decoys are present and could systematically bias choices between (superior) target options. An open question, therefore, concerns how clusters of decoy options influence real-world choice, beyond contrived trinary choice situations.

In the present study we introduce a novel approach to probe for a canonical decoy effect, termed the attraction effect, in large choice sets within a massive consumer purchasing dataset. To illustrate the attraction effect, consider a scenario in which a consumer considers two wines: a high-quality but expensive wine (Wine A) and an inexpensive but low-quality wine (Wine B; see Fig. 1A).
Critically, the attributes defining these two options (price and quality) trade off, such that neither wine is superior to the other wine in both attributes. This results in indifference between the two options: on average, consumers should choose Wine A 50% of the time and Wine B 50% of the time (Fig. 1A). The attraction effect occurs if a third, ‘distractor’ option is introduced into the choice set which is similar, but inferior, to one of the focal options. Specifically, the attraction effect takes hold when the distractor option is ‘dominated’ by one (but not both) of the target options; that is, Wine A is superior to the distractor option in both price and quality, but Wine B is superior in only one dimension (quality; see Fig. 1B). The attraction effect is characterized by an increased preference for Wine A over Wine B, despite the fact that decision-makers have equal preference for the two target options in the absence of distractor items3,7,15. An equivalent, but opposite, attraction effect also obtains when the distractor option is dominated by Wine B, but not Wine A, resulting in stronger preference for Wine B (Fig. 1C).

Fig. 1: Toy illustration of the attraction effect. Illustration of the canonical attraction effect with a single distractor. A When no distractor options are present, individuals are indifferent between Wine A and Wine B. B When Wine A dominates the distractor or ‘decoy’ option, preference shifts towards Wine A. C When Wine B dominates the distractor or ‘decoy’ option, preference shifts towards Wine B.

Here we investigate if the phenomenon of decoy effects occurs in real-world consumer purchasing decisions, where individuals encounter a multitude of choices rather than a single decoy option. We analyze 3.6 M wine purchases in a retail dataset from a large grocery chain in the United Kingdom.
We opted to examine wine choice as an instantiation of multi-attribute decision-making because it has a number of desirable properties for the present analysis. First, consumer psychology researchers have extensively examined wine purchasing behavior and have developed rich characterizations of decision variables, preferences, and motivations36,37. In particular, this body of work highlights the combined influence of internal (e.g., consumption history) and external (e.g., price, perceived quality) sources of information upon wine purchasing decisions38,39,40. This complex, multi-attribute decision problem makes wine an ideal product category for examining decoy effects in real-world consumer choice. In particular, wines vary considerably on two important, quantifiable attributes, price and quality, which demonstrably (1) influence consumers’ purchase decisions, and (2) trade off with one another41, mirroring the design of traditional laboratory decoy effect experiments3,26. Second, wine bottles are typically sold in a standard size (unlike other categories such as snack foods or bottled water), which means that product size should not be a relevant attribute in choices between wines. Third, in our dataset, we observed that most consumers purchased only a single bottle of wine during a store visit, which suggests that the available wine options effectively compete against one another.

In this massive purchase dataset, we examine whether choices between pairs of target options (operationalized as popular wines that trade off in price and quality but are equally preferred across individuals) are influenced by the varying contexts in which they appear. Because these ubiquitous target options appeared within diverse choice sets at different times, owing to variability in available wines across store locations, we can rigorously test whether preferences between target pairs are systematically influenced by the presence of dominated distractor options.
If so, this would suggest that the well-characterized attraction effect not only manifests in real-world decision-making contexts, but does so even in large choice sets and despite other environmental (e.g., time of day) and psychological variables (e.g., individual differences) moderating purchase decisions.

Leveraging the scale and heterogeneity of these choice contexts, and extending beyond traditional laboratory trinary choice paradigms, which typically comprise a single distractor option3,8, we investigate how the distribution of (multiple) distractor options influences preference between target options. With this approach, we can also examine how the number of distractor options dominated by a target item impacts the magnitude of the observed attraction effect. Finally, we investigate the extent to which differences in consumers’ purchasing histories modulate the strength of the attraction effect, suggesting boundary conditions for these observed decoy effects in naturalistic settings22,23.

Methods

Here we present our method to analyze decoy effects in large-scale, real-world purchasing datasets, describing in detail the data sources and statistical procedures used. Readers seeking an intuitive understanding of the dataset and analyses are encouraged to advance to the Results section.

Dataset

Grocery transaction records, corresponding to purchases made using customer loyalty cards, were provided to us through a collaboration with a large grocery store chain in the United Kingdom (see Fig. 2A). Each transaction record consists of a list of products, corresponding to a customer’s purchases on a visit to a particular store location. From these transaction records, we extracted all red and white wine purchases over a three-month period (225 unique red wines, 149 unique white wines), resulting in a total of roughly 11 M transactions across 1.2 M customers, from August to October 2019.
For each product in a transaction, the following data were available: a unique store ID, a unique product ID, an anonymized customer ID (based on customer loyalty cards), the time and date at which the transaction took place, the sales price of the item at purchase, and the quantity purchased. Additionally, a small proportion of transactions were marked as being on promotion or on discount, but as these purchases occurred infrequently in our dataset (5.6% of transactions), we excluded these records from our analyses, including the construction and analysis of choice sets described below.

Fig. 2: Influence of product attributes on choice. A Average wine rating plotted as a function of wine price (economy) in choice sets analyzed. B Distributions of observed wine ratings (left panel) and wine prices (i.e., economy; right panel) in choice sets analyzed. C Visualization of wine purchase rates as a function of wine quality and price. Darker points represent more frequently purchased wines. D Example choice set where neither target wine dominates the set. Notice that the black diamond, the average distractor, is not dominated in either dimension: it is less expensive than the red target, but lower quality, and higher quality than the blue target, but more expensive.

Descriptive information about the full dataset is presented in Supplementary Table 1. To ensure that options effectively competed with each other, we excluded transactions for which more than one wine with the same product ID was purchased (20% of purchases, see Supplementary Fig. 1). Finally, we excluded transactions made in very small (or infrequently visited) stores, defined as the bottom 5th percentile of stores with respect to the total number of transactions represented in the dataset, as these did not generate enough purchasing data to infer choice sets (described below).
After applying these exclusions, our analyses examined 3.6 M wine purchases made by 755,158 unique customers (see Supplementary Table 1).

Option attributes

Our analysis of decoy effects considered two attributes: price and quality. Item price (in GBP) was directly available in the dataset as the purchase price. Following past work15, our analyses consider the negative value of this attribute (−1 × price), which we termed “economy,” whereby higher values of the economy attribute reflect lower prices.

Wine quality was estimated by computing the mean star rating of the item from Vivino (https://www.vivino.com), a popular website and smartphone app where users rate wines on a 1–5 star scale. We chose this proxy measure of wine quality for two reasons. First, by virtue of its large user base (29 M users), ratings are available for the vast majority of wines in our dataset. Second, past work finds that Vivino ratings reflect both consumers’ preferences and experts’ ratings, suggesting that these ratings are informative about ‘ground-truth’ wine quality, independently from price42,43. For each wine in the dataset, we computed the wine’s average quality rating (in stars) from all user ratings provided for that wine. The median number of user ratings per wine analyzed was 448. Transactions pertaining to wines that did not appear in the Vivino ratings dataset were excluded from further analysis.

Calculation of choice sets

We begin with a general description of our approach. To examine decoy effects in consumers’ wine choices, we first constructed choice sets which represented the plausible options that faced a consumer in store at the time of the purchase, guided by the following reasoning: if consumer A purchased wine X, consumer B purchased wine Y, and consumer C purchased wine Z at the same store on the same day, we infer that wines X, Y, and Z were part of the choice set that consumer A considered before choosing wine X.
Thus, wines that are purchased on the same day at the same store can be grouped across consumers, yielding inferred choice sets in which each observed wine purchase was presumably situated.

To do this, we took the following data processing steps. First, to ensure that the considered wines were adequately represented in the dataset, we constrained our analysis to include only wines with at least 1000 purchases across all available transaction records. In other words, we reasoned that infrequently purchased wines would complicate our choice set inference process, as it is unlikely that multiple consumers would purchase the same unpopular wine at the same store on the same day. Next, to infer the choice set surrounding a particular wine purchase indicated in a transaction (that is, the other wine options that the consumer faced at the time of choice), we identified all other wines purchased at the same store on the same day by cross-referencing product IDs, consumer IDs, store IDs and dates of purchase.

Owing to the variability in store size (and consequently, selection of wines on offer), this approach resulted in choice sets of varying sizes. Our analyses were constrained to choice sets that contained between 5 and 20 wines (see Supplementary Fig. 2); these constituted the vast majority of total choice sets (85%) and left 24,803 choice sets in the final analysis (aggregated price and ratings of choice sets are visualized in Supplementary Fig. 3). We did this, chiefly, to ensure that the analyzed choice set sizes were representative of choice set sizes typically encountered by consumers in our dataset. A consequence of this choice set inference approach is that very unpopular (i.e., infrequently purchased) wines appear sparsely in our transaction data.
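The choice set inference described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `infer_choice_sets`, the `(store_id, date, product_id)` tuple layout, and the toy records are all hypothetical, and the thresholds in the usage example are scaled down from the 1000-purchase and 5-to-20-wine criteria so that the example is small enough to check by hand.

```python
from collections import Counter, defaultdict


def infer_choice_sets(purchases, min_purchases=1000, min_size=5, max_size=20):
    """Group purchases by (store, day) into inferred choice sets.

    `purchases` is an iterable of (store_id, date, product_id) tuples.
    This record layout is an assumption made for illustration only.
    """
    purchases = list(purchases)

    # Keep only wines that are purchased often enough across all records.
    counts = Counter(product for _, _, product in purchases)
    popular = {product for product, n in counts.items() if n >= min_purchases}

    # Every popular wine bought at the same store on the same day is
    # treated as part of the same inferred choice set.
    sets_by_context = defaultdict(set)
    for store, day, product in purchases:
        if product in popular:
            sets_by_context[(store, day)].add(product)

    # Retain only choice sets whose size falls within the allowed range.
    return {
        context: wines
        for context, wines in sets_by_context.items()
        if min_size <= len(wines) <= max_size
    }


# Toy usage with scaled-down thresholds (hypothetical data).
toy = [
    ("s1", "2019-08-01", "X"),
    ("s1", "2019-08-01", "Y"),
    ("s1", "2019-08-01", "Z"),
    ("s1", "2019-08-02", "X"),
    ("s2", "2019-08-01", "X"),
]
toy_sets = infer_choice_sets(toy, min_purchases=1, min_size=2, max_size=20)
print(toy_sets)
```

In this toy run only store `s1` on 2019-08-01 yields a retained choice set (wines X, Y and Z); the other two store-days contain a single wine and fall below the minimum set size.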
Thus, we note that these estimated choice sets represent a lower bound of the true sizes of the choice sets facing consumers, and accordingly, represent sets of viable options consumers were faced with, rather than a (necessarily) complete reconstruction of all possible options facing a consumer.

Identification of target and decoy options

Target options were defined as pairs of wines that are popular (i.e., frequently purchased overall in the dataset) and equally preferred. Importantly, target options trade off along two attributes (here, economy and quality) such that one item is cheaper but of lower quality, and the other is more expensive but of higher quality. In our dataset, we identified target pairs as the 20 most frequently purchased pairs of popular wines (across the entire dataset) that met the following criteria: (1) one wine’s economy was higher (i.e., average purchase price was lower) and its rated quality was lower than the other’s, and (2) both wines were chosen, across choice sets, with roughly equal preference, such that the observed choice likelihoods were no more than 75% in favor of either option across choice sets. We did this to ensure that choice sets of interest had viable target options with attributes that effectively traded off and that were comparable with respect to overall popularity.

While our approach mirrors traditional trinary choice paradigms in its identification of target options (two target alternatives which trade off in attribute values3), choice sets in the present analysis can contain multiple distractor (non-target) options rather than a single distractor option. For simplicity, we took the average quality (rating) and economy of each set of distractor options as a summary measure of the distribution of distractor options’ attribute values (Fig.
2B–D; we return to theoretical questions surrounding the analysis of multiple distractors in the Discussion).

We then examined each inferred choice set in which these 20 previously identified target pairs appeared, categorizing each choice set according to whether (or not) one of the target options dominated the average of the distractor options, which yielded three possible scenarios: choice sets where the low economy/high quality wine dominates the average of the distractors (Fig. 2B), choice sets where the high economy/low quality wine dominates the average of the distractors (Fig. 2C), and choice sets where neither wine dominates the average of the distractors (i.e., with distractors being similarly priced and/or rated to targets; Fig. 2D).

The key dependent measure in our analysis of decoy effects is the relative preference between target wines, computed across these three classifications of inferred choice sets.

Individual differences

In addition to examining aggregate preferences, we also examined whether individual differences in the frequency with which customers purchased wines influenced their sensitivity to the decoy effects. To do this, we categorized shoppers as Frequent (18,652 shoppers) versus Infrequent shoppers (31,526 shoppers) by performing a median split (to maintain roughly comparable sample sizes) upon the frequency with which their customer ID number appeared in the inferred choice sets described above. Shopper type (frequent versus infrequent) was subsequently included as a predictor variable (deviance-coded) in the choice model described below.

Inferential statistics

We estimated three regression models to test our three key hypotheses. First, we used a linear regression to examine the effects of quality and economy upon wine popularity, defined as the proportion of purchases of that wine (irrespective of choice set), which we log-transformed as these popularity scores were substantially skewed.
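As an illustration of the dominance classification described in the Methods above, the following Python sketch assigns a choice set to a dominance category from the mean of its distractors. It is a hedged sketch rather than the authors' code: the function names, the `(economy, quality)` tuple representation, and the example values are hypothetical. Two further assumptions are made explicit in the comments: dominance is taken to mean strictly better on both attributes, and a separate `"both"` label is returned when both targets dominate the average distractor, a case the three-way scheme in the text does not cover.

```python
from statistics import mean


def dominates(target, other):
    """Return True if `target` is strictly better than `other` on both
    attributes. Each option is an (economy, quality) pair, where
    economy = -1 * price. Strict superiority is an assumption here."""
    return target[0] > other[0] and target[1] > other[1]


def dominance_category(low_econ_high_qual, high_econ_low_qual, distractors):
    """Classify a choice set by which target dominates the mean distractor.

    `distractors` must be a non-empty list of (economy, quality) pairs.
    """
    avg = (
        mean([econ for econ, _ in distractors]),
        mean([qual for _, qual in distractors]),
    )
    a_dominates = dominates(low_econ_high_qual, avg)
    b_dominates = dominates(high_econ_low_qual, avg)
    if a_dominates and b_dominates:
        return "both"  # assumption: not one of the three categories in the text
    if a_dominates:
        return "low_econ_high_qual"
    if b_dominates:
        return "high_econ_low_qual"
    return "neither"


# Hypothetical targets: an expensive, highly rated wine and a cheap,
# lower-rated wine, each written as (economy, quality).
wine_a = (-10.0, 4.0)
wine_b = (-6.0, 3.5)

print(dominance_category(wine_a, wine_b, [(-12.0, 3.8), (-11.0, 3.9)]))
print(dominance_category(wine_a, wine_b, [(-7.0, 3.2), (-8.0, 3.4)]))
print(dominance_category(wine_a, wine_b, [(-8.0, 3.7), (-8.0, 3.8)]))
```

The three calls cover the three scenarios in turn: distractors that are pricier and lower rated than the high-quality target, distractors that are pricier and lower rated than the high-economy target, and distractors that sit between the two targets on both attributes.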
Second, to examine decoy effects (our main analyses of interest), we estimated a logistic regression model to predict relative preference between target wines (e.g., a high-economy/low-quality wine versus a low-economy/high-quality wine) across inferred choice sets. Following previous work15, this regression only included choices of one of the two target options. This model predicted relative preference between target options as a function of: (1) dominance category (whether the average of the distractor options was dominated by the high-economy/low-quality target, the low-economy/high-quality target, or neither, with the non-dominated category taken as the intercept), (2) set size, to test for potential effects of the number of options in the choice set, (3) maximum quality of the choice set and (4) maximum economy in the choice set, to control for the possibility that the best option with respect to either choice dimension influenced choice, (5) average quality and (6) average economy of the choice set, to control for set-wide context effects2, and (7) whether the purchase occurred on a weekend (Friday to Sunday; coded 1) or weekday (Monday to Thursday; coded 0), because alcohol sales are known to increase precipitously on weekends44.
We examined the main effects of each of these variables upon wine preference, as well as their interactions with dominance category, such that the full choice model had the following specification:

$$\begin{array}{l}\mathrm{Choose}_{\mathrm{A}} \sim \left(\mathrm{Intercept}[\text{Non-Dominated Set}] + \mathrm{Dominance\ Category}[\text{High Econ; Low Qual}] + \mathrm{Dominance\ Category}[\text{Low Econ; High Qual}]\right)\\ \qquad * \left(\mathrm{Set\ Size} * \mathrm{Weekend}\right) + \mathrm{Max}\left(\mathrm{Rating}_{\mathrm{set}}\right) + \mathrm{Max}\left(\mathrm{Price}_{\mathrm{set}}\right)\\ \qquad + \mathrm{Mean}\left(\mathrm{Rating}_{\mathrm{set}}\right) + \mathrm{Mean}\left(\mathrm{Price}_{\mathrm{set}}\right)\end{array}$$

(where + refers to the addition of a main effect and * refers to an interaction term and associated main effects). All continuous variables were centered using their median value.

Third, in follow-up analyses examining the proportion of dominated distractors, we estimated the same logistic regression model but replaced the “dominance category” predictor with continuous predictor variables representing the proportion of distractors dominated by the high-economy/low-quality target, and the proportion of distractors dominated by the low-economy/high-quality target (both of which were centered at their median value). All reported confidence intervals are at the 95% level. For each model we report its Cox-Snell pseudo-R² value.

Results

Influence of product attributes on choice

A precondition for examining decoy effects in multi-attribute choice is that the attributes effectively trade off: here, higher-quality wines should be more expensive and cheaper wines should, in general, be of lower quality.
In validation of this, we observed a robust negative relationship between wine quality (computed from consumer ratings; see Methods) and economy, such that higher-rated wines were more expensive than lower-rated wines (ratings predicting price: b = −1.76, p =