{"title": "Bayesian Belief Polarization", "book": "Advances in Neural Information Processing Systems", "page_first": 853, "page_last": 861, "abstract": "Situations in which people with opposing prior beliefs observe the same evidence and then strengthen those existing beliefs are frequently offered as evidence of human irrationality. This phenomenon, termed belief polarization, is typically assumed to be non-normative. We demonstrate, however, that a variety of cases of belief polarization are consistent with a Bayesian approach to belief revision. Simulation results indicate that belief polarization is not only possible but relatively common within the class of Bayesian models that we consider.", "full_text": "Bayesian Belief Polarization\n\nAlan Jern\n\nDepartment of Psychology\nCarnegie Mellon University\n\najern@cmu.edu\n\nKai-min K. Chang\n\nLanguage Technologies Institute\n\nCarnegie Mellon University\nkkchang@cs.cmu.edu\n\nCharles Kemp\n\nDepartment of Psychology\nCarnegie Mellon University\n\nckemp@cmu.edu\n\nAbstract\n\nEmpirical studies have documented cases of belief polarization, where two peo-\nple with opposing prior beliefs both strengthen their beliefs after observing the\nsame evidence. Belief polarization is frequently offered as evidence of human\nirrationality, but we demonstrate that this phenomenon is consistent with a fully\nBayesian approach to belief revision. Simulation results indicate that belief po-\nlarization is not only possible but relatively common within the set of Bayesian\nmodels that we consider.\n\nSuppose that Carol has requested a promotion at her company and has received a score of 50 on an\naptitude test. Alice, one of the company\u2019s managers, began with a high opinion of Carol and became\neven more con\ufb01dent of her abilities after seeing her test score. 
Bob, another manager, began with a\nlow opinion of Carol and became even less con\ufb01dent about her quali\ufb01cations after seeing her score.\nOn the surface, it may appear that either Alice or Bob is behaving irrationally, since the same piece\nof evidence has led them to update their beliefs about Carol in opposite directions. This situation is\nan example of belief polarization [1, 2], a widely studied phenomenon that is often taken as evidence\nof human irrationality [3, 4].\nIn some cases, however, belief polarization may appear much more sensible when all the relevant\ninformation is taken into account. Suppose, for instance, that Alice was familiar with the aptitude\ntest and knew that it was scored out of 60, but that Bob was less familiar with the test and assumed\nthat the score was a percentage. Even though only one interpretation of the score can be correct,\nAlice and Bob have both made rational inferences given their assumptions about the test.\nSome instances of belief polarization are almost certain to qualify as genuine departures from ra-\ntional inference, but we argue in this paper that others will be entirely compatible with a rational\napproach. Distinguishing between these cases requires a precise normative standard against which\nhuman inferences can be compared. We suggest that Bayesian inference provides this normative\nstandard, and present a set of Bayesian models that includes cases where polarization can and can-\nnot emerge. Our work is in the spirit of previous studies that use careful rational analyses in order\nto illuminate apparently irrational human behavior (e.g. [5, 6, 7]).\nPrevious studies of belief polarization have occasionally taken a Bayesian approach, but often the\ngoal is to show how belief polarization can emerge as a consequence of approximate inference in\na Bayesian model that is subject to memory constraints or processing limitations [8]. 
In contrast,\nwe demonstrate that some examples of polarization are compatible with a fully Bayesian approach.\nOther formal accounts of belief polarization have relied on complex versions of utility theory [9],\nor have focused on continuous hypothesis spaces [10] unlike the discrete hypothesis spaces usually\nconsidered by psychological studies of belief polarization. We focus on discrete hypothesis spaces\nand require no additional machinery beyond the basics of Bayesian inference.\nWe begin by introducing the belief revision phenomena considered in this paper and developing a\nBayesian approach that clarifies whether and when these phenomena should be considered irrational.\nWe then consider several Bayesian models that are capable of producing belief polarization and\nillustrate them with concrete examples. Having demonstrated that belief polarization is compatible\nwith a Bayesian approach, we present simulations suggesting that this phenomenon is relatively\ngeneric within the space of models that we consider. We finish with some general comments on\nhuman rationality and normative models.\n\nFigure 1: Examples of belief updating behaviors for two individuals, A (solid line) and B (dashed\nline). The individuals begin with different beliefs about hypothesis h1. After observing the same set\nof evidence, their beliefs may (a) move in opposite directions or (b) move in the same direction.\n\n1 Belief revision phenomena\n\nThe term \u201cbelief polarization\u201d is generally used to describe situations in which two people observe\nthe same evidence and update their respective beliefs in the directions of their priors. A study by\nLord et al. [1] provides one classic example in which participants read about two studies, one of\nwhich concluded that the death penalty deters crime and another which concluded that the death\npenalty has no effect on crime.\n
After exposure to this mixed evidence, supporters of the death\npenalty strengthened their support and opponents strengthened their opposition.\nWe will treat belief polarization as a special case of contrary updating, a phenomenon where two\npeople update their beliefs in opposite directions after observing the same evidence (Figure 1a).\nWe distinguish between two types of contrary updating. Belief divergence refers to cases in which\nthe person with the stronger belief in some hypothesis increases the strength of his or her belief\nand the person with the weaker belief in the hypothesis decreases the strength of his or her belief\n(Figure 1a(i)). Divergence therefore includes cases of traditional belief polarization. The opposite\nof divergence is belief convergence (Figure 1a(ii)), in which the person with the stronger belief\ndecreases the strength of his or her belief and the person with the weaker belief increases the strength\nof his or her belief. Contrary updating may be contrasted with parallel updating (Figure 1b), in\nwhich the two people update their beliefs in the same direction. Throughout this paper, we consider\nonly situations in which both people change their beliefs after observing some evidence. All such\nsituations can be unambiguously classi\ufb01ed as instances of parallel or contrary updating.\nParallel updating is clearly compatible with a normative approach, but the normative status of di-\nvergence and convergence is less clear. Many authors argue that divergence is irrational, and many\nof the same authors also propose that convergence is rational [2, 3]. For example, Baron [3] writes\nthat \u201cNormatively, we might expect that beliefs move toward the middle of the range when people\nare presented with mixed evidence.\u201d (p. 
210) The next section presents a formal analysis that challenges\nthe conventional wisdom about these phenomena and clarifies the cases where they can be\nconsidered rational.\n\n2 A Bayesian approach to belief revision\n\nSince belief revision involves inference under uncertainty, Bayesian inference provides the appropriate\nnormative standard. Consider a problem where two people observe data d that bear on some\nhypothesis h1. Let P1(\u00b7) and P2(\u00b7) be distributions that capture the two people\u2019s respective beliefs.\nContrary updating occurs whenever one person\u2019s belief in h1 increases and the other person\u2019s belief\nin h1 decreases, or when\n\n[P1(h1|d) \u2212 P1(h1)] [P2(h1|d) \u2212 P2(h1)] < 0 .    (1)\n\nFigure 2: (a) A simple Bayesian network that cannot produce either belief divergence or belief\nconvergence. (b)\u2013(h) All possible three-node Bayes nets subject to the constraints described in\nthe text. Networks in Family 1 can produce only parallel updating, but networks in Family 2 can\nproduce both parallel and contrary updating.\n\nWe will use Bayesian networks to capture the relationships between H, D, and any other variables\nthat are relevant to the situation under consideration. For example, Figure 2a captures the idea that\nthe data D are probabilistically generated from hypothesis H. The remaining networks in Figure 2\nshow several other ways in which D and H may be related, and will be discussed later.\nWe assume that the two individuals agree on the variables that are relevant to a problem and agree\nabout the relationships between these variables.\n
We can formalize this idea by requiring that both\npeople agree on the structure and the conditional probability distributions (CPDs) of a network N\nthat captures relationships between the relevant variables, and that they differ only in the priors they\nassign to the root nodes of N. If N is the Bayes net in Figure 2a, then we assume that the two people\nmust agree on the distribution P(D|H), although they may have different priors P1(H) and P2(H).\nIf two people agree on network N but have different priors on the root nodes, we can create a single\nexpanded Bayes net to simulate the inferences of both individuals. The expanded network is created\nby adding a background knowledge node B that sends directed edges to all root nodes in N, and acts\nas a switch that sets different root node priors for the two different individuals. Given this expanded\nnetwork, distributions P1 and P2 can be recovered by conditioning on the value of the\nbackground knowledge node, and Equation 1 can be rewritten as\n\n[P(h1|d, b1) \u2212 P(h1|b1)] [P(h1|d, b2) \u2212 P(h1|b2)] < 0    (2)\n\nwhere P(\u00b7) represents the probability distribution captured by the expanded network.\nSuppose that there are exactly two mutually exclusive hypotheses. For example, h1 and h0 might\nstate that the death penalty does or does not deter crime.\n
In this case Equation 2 implies that contrary\nupdating occurs when\n\n[P(d|h1, b1) \u2212 P(d|h0, b1)] [P(d|h1, b2) \u2212 P(d|h0, b2)] < 0 .    (3)\n\nEquation 3 is derived in the supporting material, and leads immediately to the following result:\n\nR1: If H is a binary variable and D and B are conditionally independent given\nH, then contrary updating is impossible.\n\nResult R1 follows from the observation that if D and B are conditionally independent given H, then\nP(d|h, b1) = P(d|h, b2) = P(d|h) for each hypothesis h, so the product in Equation 3 is equal to\n(P(d|h1) \u2212 P(d|h0))\u00b2, which cannot be less than zero.\nR1 implies that the simple Bayes net in Figure 2a is incapable of producing contrary updating, an\nobservation previously made by Lopes [11]. Our analysis may help to explain the common intuition\nthat belief divergence is irrational, since many researchers seem to implicitly adopt a model in which\nH and D are the only relevant variables. Network 2a, however, is too simple to capture the causal\nrelationships that are present in many real world situations. For example, the promotion example at\nthe beginning of this paper is best captured using a network with an additional node that represents\nthe grading scale for the aptitude test. Networks with many nodes may be needed for some real\nworld problems, but here we explore the space of three-node networks.\nWe restrict our attention to connected graphs in which D has no outgoing edges, motivated by the\nidea that the three variables should be linked and that the data are the final result of some generative\nprocess. The seven graphs that meet these conditions are shown in Figures 2b\u2013h, where the additional\nvariable has been labeled V.\n
These Bayes nets illustrate cases in which (b) V is an additional\npiece of evidence that bears on H, (c) V informs the prior probability of H, (d)\u2013(e) D is generated\nby an intervening variable V, (f) V is an additional generating factor of D, (g) V informs both the\nprior probability of H and the likelihood of D, and (h) H and D are both effects of V. The graphs\nin Figure 2 have been organized into two families. R1 implies that none of the graphs in Family 1 is\ncapable of producing contrary updating. The next section demonstrates by example that all three of\nthe graphs in Family 2 are capable of producing contrary updating.\n\n                     Conventional wisdom   Models: Family 1   Models: Family 2\nBelief divergence             -                    -                  \u2713\nBelief convergence            \u2713                    -                  \u2713\nParallel updating             \u2713                    \u2713                  \u2713\n\nTable 1: The first column represents the conventional wisdom about which belief revision phenomena\nare normative. The models in the remaining columns include all three-node Bayes nets. This set\nof models can be partitioned into those that support both belief divergence and convergence (Family\n2) and those that support neither (Family 1).\n\nTable 1 compares the two families of Bayes nets to the informal conclusions about normative approaches\nthat are often found in the psychological literature. As previously noted, the conventional\nwisdom holds that belief divergence is irrational but that convergence and parallel updating are\nboth rational. Our analysis suggests that this position has little support. Depending on the causal\nstructure of the problem under consideration, a rational approach should allow both divergence and\nconvergence or neither.\nAlthough we focus in this paper on Bayes nets with no more than three nodes, the class of all network\nstructures can be partitioned into those that can (Family 2) and cannot (Family 1) produce contrary\nupdating.\n
R1 is true for Bayes nets of any size and characterizes one group of networks that belong\nto Family 1. Networks where the data provide no information about the hypotheses must also fail\nto produce contrary updating. Note that if D and H are conditionally independent given B, then\nthe left side of Equation 3 is equal to zero, meaning contrary updating cannot occur. We conjecture\nthat all remaining networks can produce contrary updating if the cardinalities of the nodes and the\nCPDs are chosen appropriately. Future studies can attempt to verify this conjecture and to precisely\ncharacterize the CPDs that lead to contrary updating.\n\n3 Examples of rational belief divergence\n\nWe now present four scenarios that can be modeled by the three-node Bayes nets in Family 2.\nOur purpose in developing these examples is to demonstrate that these networks can produce belief\ndivergence and to provide some everyday examples in which this behavior is both normative and\nintuitive.\n\n3.1 Example 1: Promotion\n\nWe first consider a scenario that can be captured by Bayes net 2f, in which the data depend on two\nindependent factors. Recall the scenario described at the beginning of this paper: Alice and Bob\nare responsible for deciding whether to promote Carol. For simplicity, we consider a case where\nthe data represent a binary outcome (whether or not Carol\u2019s r\u00e9sum\u00e9 indicates that she is included\nin The Directory of Notable People) rather than her score on an aptitude test. Alice believes that\nThe Directory is a reputable publication but Bob believes it is illegitimate. This situation is represented\nby the Bayes net and associated CPDs in Figure 3a.\n
In the tables, the hypothesis space H =\n{\u2018Unqualified\u2019 = 0, \u2018Qualified\u2019 = 1} represents whether or not Carol is qualified for the promotion,\nthe additional factor V = {\u2018Disreputable\u2019 = 0, \u2018Reputable\u2019 = 1} represents whether The Directory\nis a reputable publication, and the data variable D = {\u2018Not included\u2019 = 0, \u2018Included\u2019 = 1} represents\nwhether Carol is featured in it. The actual probabilities were chosen to reflect the fact that only\nan unqualified person is likely to pad their r\u00e9sum\u00e9 by mentioning a disreputable publication, but that\nonly a qualified person is likely to be included in The Directory if it is reputable. Note that Alice\nand Bob agree on the conditional probability distribution for D, but assign different priors to V and\nH. Alice and Bob therefore interpret the meaning of Carol\u2019s presence in The Directory differently,\nresulting in the belief divergence shown in Figure 4a.\n\nFigure 3: The Bayes nets and conditional probability distributions used in (a) Example 1: Promotion,\n(b) Example 2: Religious belief, (c) Example 3: Election polls, (d) Example 4: Political belief.\n\nThis scenario is one instance of a large number of belief divergence cases that can be attributed to two\nindividuals possessing different mental models of how the observed evidence was generated. For\ninstance, suppose now that Alice and Bob are both on an admissions committee and are evaluating a\nrecommendation letter for an applicant. Although the letter is positive, it is not enthusiastic.\n
Alice,\nwho has less experience reading recommendation letters, interprets the letter as a strong endorsement.\nBob, however, takes the lack of enthusiasm as an indication that the author has some misgivings [12].\nAs in the promotion scenario, the differences in Alice\u2019s and Bob\u2019s experience can be effectively\nrepresented by the priors they assign to the H and V nodes in a Bayes net of the form in Figure 2f.\n\n3.2 Example 2: Religious belief\n\nWe now consider a scenario captured by Bayes net 2g. In our example for Bayes net 2f, the status\nof an additional factor V affected how Alice and Bob interpreted the data D, but did not shape their\nprior beliefs about H. In many cases, however, the additional factor V will influence both people\u2019s\nprior beliefs about H as well as their interpretation of the relationship between D and H. Bayes net\n2g captures this situation, and we provide a concrete example inspired by an experiment conducted\nby Batson [13].\nSuppose that Alice believes in a \u201cChristian universe\u201d: she believes in the divinity of Jesus Christ and\nexpects that followers of Christ will be persecuted. Bob, on the other hand, believes in a \u201csecular\nuniverse.\u201d This belief leads him to doubt Christ\u2019s divinity, but to believe that if Christ were divine,\nhis followers would likely be protected rather than persecuted. Now suppose that both Alice and\nBob observe that Christians are, in fact, persecuted, and reassess the probability of Christ\u2019s divinity.\nThis situation is represented by the Bayes net and associated CPDs in Figure 3b.\n
In the tables, the\nhypothesis space H = {\u2018Human\u2019 = 0, \u2018Divine\u2019 = 1} represents the divinity of Jesus Christ, the\nadditional factor V = {\u2018Secular\u2019 = 0, \u2018Christian\u2019 = 1} represents the nature of the universe, and\nthe data variable D = {\u2018Not persecuted\u2019 = 0, \u2018Persecuted\u2019 = 1} represents whether Christians are\nsubject to persecution. The exact probabilities were chosen to reflect the fact that, regardless of\nworldview, people will agree on a \u201cbase rate\u201d of persecution given that Christ is not divine, but that\nmore persecution is expected if the Christian worldview is correct than if the secular worldview is\ncorrect. Unlike in the previous scenario, Alice and Bob agree on the CPDs for both D and H, but\ndiffer in the priors they assign to V. As a result, Alice and Bob disagree about whether persecution\nsupports or undermines a Christian worldview, which leads to the divergence shown in Figure 4b.\n\nFigure 4: Belief revision outcomes for (a) Example 1: Promotion, (b) Example 2: Religious belief,\n(c) Example 3: Election polls, and (d) Example 4: Political belief. In all four plots, the updated\nbeliefs for Alice (solid line) and Bob (dashed line) are computed after observing the data described\nin the text. The plots confirm that all four of our example networks can lead to belief divergence.\n\nThis scenario is analogous to many real world situations in which one person has knowledge that\nthe other does not.\n
For instance, in a police interrogation, someone with little knowledge of the case\n(V ) might take a suspect\u2019s alibi (D) as strong evidence of their innocence (H). However, a detective\nwith detailed knowledge of the case may assign a higher prior probability to the subject\u2019s guilt\nbased on other circumstantial evidence, and may also notice a detail in the suspect\u2019s alibi that only\nthe culprit would know, thus making the statement strong evidence of guilt. In all situations of this\nkind, although two people possess different background knowledge, their inferences are normative\ngiven that knowledge, consistent with the Bayes net in Figure 2g.\n\n3.3 Example 3: Election polls\n\nWe now consider two qualitatively different cases that are both captured by Bayes net 2h. The\nnetworks considered so far have all included a direct link between H and D.\nIn our next two\nexamples, we consider cases where the hypotheses and observed data are not directly linked, but are\ncoupled by means of one or more unobserved causal factors.\nSuppose that an upcoming election will be contested by two Republican candidates, Rogers and\nRudolph, and two Democratic candidates, Davis and Daly. Alice and Bob disagree about the various\ncandidates\u2019 chances of winning, with Alice favoring the two Republicans and Bob favoring the two\nDemocrats. Two polls were recently released, one indicating that Rogers was most likely to win the\nelection and the other indicating that Daly was most likely to win. 
After considering these polls,\nthey both assess the likelihood that a Republican will win the election.\nThis situation is represented by the Bayes net and associated CPDs in Figure 3c.\nIn the tables,\nthe hypothesis space H = {\u2018Democrat wins\u2019 = 0, \u2018Republican wins\u2019 = 1} represents the winning\nparty, the variable V = {\u2018Rogers\u2019 = 0, \u2018Rudolph\u2019 = 1, \u2018Davis\u2019 = 2, \u2018Daly\u2019 = 3} represents the\nwinning candidate, and the data variables D1 = D2 = {\u2018Rogers\u2019 = 0, \u2018Rudolph\u2019 = 1, \u2018Davis\u2019 =\n2, \u2018Daly\u2019 = 3} represent the results of the two polls. The exact probabilities were chosen to re\ufb02ect\nthe fact that the polls are likely to re\ufb02ect the truth with some noise, but whether a Democrat or\nRepublican wins is completely determined by the winning candidate V . In Figure 3c, only a single\nD node is shown because D1 and D2 have identical CPDs. The resulting belief divergence is shown\nin Figure 4c.\nNote that in this scenario, Alice\u2019s and Bob\u2019s different priors cause them to discount the poll that\ndisagrees with their existing beliefs as noise, thus causing their prior beliefs to be reinforced by the\nmixed data. This scenario was inspired by the death penalty study [1] alluded to earlier, in which\na set of mixed results caused supporters and opponents of the death penalty to strengthen their\nexisting beliefs. 
We do not claim that people\u2019s behavior in this study can be explained with exactly\nthe model employed here, but our analysis does show that selective interpretation of evidence is\nsometimes consistent with a rational approach.\n\n3.4 Example 4: Political belief\n\nWe conclude with a second illustration of Bayes net 2h in which two people agree on the interpretation\nof an observed piece of evidence but disagree about the implications of that evidence. In\nthis scenario, Alice and Bob are two economists with different philosophies about how the federal\ngovernment should approach a major recession. Alice believes that the federal government should\nincrease its own spending to stimulate economic activity; Bob believes that the government should\ndecrease its spending and reduce taxes instead, providing taxpayers with more spending money. A\nnew bill has just been proposed and an independent study found that the bill was likely to increase\nfederal spending. Alice and Bob now assess the likelihood that this piece of legislation will improve\nthe economic climate.\nThis scenario can be modeled by the Bayes net and associated CPDs in Figure 3d. In the tables, the\nhypothesis space H = {\u2018Bad policy\u2019 = 0, \u2018Good policy\u2019 = 1} represents whether the new bill is\ngood for the economy and the data variable D = {\u2018No spending\u2019 = 0, \u2018Spending increase\u2019 = 1}\nrepresents the conclusions of the independent study. Unlike in previous scenarios, we introduce two\nadditional factors, V1 = {\u2018Fiscally conservative\u2019 = 0, \u2018Fiscally liberal\u2019 = 1}, which represents the\noptimal economic philosophy, and V2 = {\u2018No spending\u2019 = 0, \u2018Spending increase\u2019 = 1}, which\nrepresents the spending policy of the new bill.\n
The exact probabilities in the tables were chosen to\nreflect the fact that if the bill does not increase spending, the policy it enacts may still be good for\nother reasons. A uniform prior was placed on V2 for both people, reflecting the fact that they have\nno prior expectations about the spending in the bill. However, the priors placed on V1 for Alice and\nBob reflect their different beliefs about the best economic policy. The resulting belief divergence\nbehavior is shown in Figure 4d. The model used in this scenario bears a strong resemblance to the\nprobabilogical model of attitude change developed by McGuire [14] in which V1 and V2 might be\nlogical \u201cpremises\u201d that entail the \u201cconclusion\u201d H.\n\n4 How common is contrary updating?\n\nWe have now described four concrete cases where belief divergence is captured by a normative\napproach. It is possible, however, that belief divergence is relatively rare within the Bayes nets\nof Family 2, and that our four examples are exotic special cases that depend on carefully selected\nCPDs. To rule out this possibility, we ran simulations to explore the space of all possible CPDs for\nthe three networks in Family 2.\nWe initially considered cases where H, D, and V were binary variables, and ran two simulations\nfor each model. In one simulation, the priors and each row of each CPD were sampled from a\nsymmetric Beta distribution with parameter 0.1, resulting in probabilities highly biased toward 0\nand 1. In the second simulation, the probabilities were sampled from a uniform distribution. In\neach trial, a single set of CPDs was generated and then two different priors were generated for\neach root node in the graph to simulate two individuals, consistent with our assumption that two\nindividuals may have different priors but must agree about the conditional probabilities.\n
20,000\ntrials were carried out in each simulation, and the proportion of trials that led to convergence and\ndivergence was computed. Trials were only counted as instances of convergence or divergence if\n|P(H = 1|D = 1) \u2212 P(H = 1)| > \u03b5 for both individuals, with \u03b5 = 1 \u00d7 10\u207b\u2075.\nThe results of these simulations are shown in Table 2. The supporting material proves that divergence\nand convergence are equally common, and therefore the percentages in the table show the\nfrequencies for contrary updating of either type. Our primary question was whether contrary updating\nis rare or anomalous. In all but the third simulation, contrary updating constituted a substantial\nproportion of trials, suggesting that the phenomenon is relatively generic. We were also interested\nin whether this behavior relied on particular settings of the CPDs. The fact that percentages for the\nuniform distribution are approximately the same or greater than for the biased distribution indicates\nthat contrary updating appears to be a relatively generic behavior for the Bayes nets we considered.\nMore generally, these results directly challenge the suggestion that normative accounts are not suited\nfor modeling belief divergence.\nThe last two columns of Table 2 show results for two simulations with the same Bayes net, the\nonly difference being whether V was treated as 2-valued (binary) or 4-valued. The 4-valued case\nis included because both Examples 3 and 4 considered multi-valued additional factor variables V.\n\n[Column headers in the original are Bayes net diagrams, not reproduced here.]\n           Column 1   Column 2   Column 3 (2-valued V)   Column 4 (4-valued V)\nBiased       9.6%      12.7%             0%                     23.3%\nUniform     18.2%      16.0%             0%                     20.0%\n\nTable 2: Simulation results. The percentages indicate the proportion of trials that produced contrary\nupdating using the specified Bayes net (column) and probability distributions (row). The\nprior and conditional probabilities were either sampled from a Beta(0.1, 0.1) distribution (biased)\nor a Beta(1, 1) distribution (uniform).\n
The probabilities for the simulation results shown\nin the last column were sampled from a Dirichlet([0.1, 0.1, 0.1, 0.1]) distribution (biased) or a\nDirichlet([1, 1, 1, 1]) distribution (uniform).\n\nIn Example 4, we used two binary variables, but we could have equivalently used a single 4-valued\nvariable. Belief convergence and divergence are not possible in the binary case, a result that is\nproved in the supporting material. We believe, however, that convergence and divergence are fairly\ncommon whenever V takes three or more values, and the simulation in the last column of the table\ncon\ufb01rms this claim for the 4-valued case.\nGiven that belief divergence seems relatively common in the space of all Bayes nets, it is natural\nto explore whether cases of rational divergence are regularly encountered in the real world. One\npossible approach is to analyze a large database of networks that capture everyday belief revision\nproblems, and to determine what proportion of networks lead to rational divergence. Future studies\ncan explore this issue, but our simulations suggest that contrary updating is likely to arise in cases\nwhere it is necessary to move beyond a simple model like the one in Figure 2a and consider several\ncausal factors.\n\n5 Conclusion\n\nThis paper presented a family of Bayes nets that can account for belief divergence, a phenomenon\nthat is typically considered to be incompatible with normative accounts. We provided four concrete\nexamples that illustrate how this family of networks can capture a variety of settings where belief\ndivergence can emerge from rational statistical inference. 
We also described a series of simulations\nthat suggest that belief divergence is not only possible but relatively common within the family of\nnetworks that we considered.\nOur work suggests that belief polarization should not always be taken as evidence of irrationality,\nand that researchers who aim to document departures from rationality may wish to consider alternative\nphenomena instead. One such phenomenon might be called \u201cinevitable belief reinforcement\u201d\nand occurs when supporters of a hypothesis update their belief in the same direction for all possible\ndata sets d. For example, a gambler will demonstrate inevitable belief reinforcement if he or she\nbecomes increasingly convinced that a roulette wheel is biased towards red regardless of whether\nthe next spin produces red, black, or green. This phenomenon is provably inconsistent with any fully\nBayesian approach, and therefore provides strong evidence of irrationality.\nAlthough we propose that some instances of polarization are compatible with a Bayesian approach,\nwe do not claim that human inferences are always or even mostly rational. We suggest, however,\nthat characterizing normative behavior can require careful thought, and that formal analyses are\ninvaluable for assessing the rationality of human inferences. In some cases, a formal analysis will\nprovide an appropriate baseline for understanding how human inferences depart from rational norms.\nIn other cases, a formal analysis will suggest that an apparently irrational inference makes sense once\nall of the relevant information is taken into account.\n\nReferences\n\n[1] C. G. Lord, L. Ross, and M. R. Lepper. Biased assimilation and attitude polarization: The\neffects of prior theories on subsequently considered evidence. Journal of Personality and\nSocial Psychology, 37(11):2098\u20132109, 1979.\n\n[2] L. Ross and M. R. Lepper. The perseverance of beliefs: Empirical and normative considerations.\n
In New directions for methodology of social and behavioral science: Fallible judgment\nin behavioral research. Jossey-Bass, San Francisco, 1980.\n\n[3] J. Baron. Thinking and Deciding. Cambridge University Press, Cambridge, 4th edition, 2008.\n\n[4] A. Gerber and D. Green. Misperceptions about perceptual bias. Annual Review of Political\nScience, 2:189\u2013210, 1999.\n\n[5] M. Oaksford and N. Chater. A rational analysis of the selection task as optimal data selection.\nPsychological Review, 101(4):608\u2013631, 1994.\n\n[6] U. Hahn and M. Oaksford. The rationality of informal argumentation: A Bayesian approach\nto reasoning fallacies. Psychological Review, 114(3):704\u2013732, 2007.\n\n[7] S. Sher and C. R. M. McKenzie. Framing effects and rationality. In N. Chater and M. Oaksford,\neditors, The probabilistic mind: Prospects for Bayesian cognitive science. Oxford University\nPress, Oxford, 2008.\n\n[8] B. O\u2019Connor. Biased evidence assimilation under bounded Bayesian rationality. Master\u2019s\nthesis, Stanford University, 2006.\n\n[9] A. Zimper and A. Ludwig. Attitude polarization. Technical report, Mannheim Research Institute\nfor the Economics of Aging, 2007.\n\n[10] A. K. Dixit and J. W. Weibull. Political polarization. Proceedings of the National Academy of\nSciences, 104(18):7351\u20137356, 2007.\n\n[11] L. L. Lopes. Averaging rules and adjustment processes in Bayesian inference. Bulletin of the\nPsychonomic Society, 23(6):509\u2013512, 1985.\n\n[12] A. Harris, A. Corner, and U. Hahn. \u201cDamned by faint praise\u201d: A Bayesian account. In A. D. De\nGroot and G. Heymans, editors, Proceedings of the 31st Annual Conference of the Cognitive\nScience Society, Austin, TX, 2009. Cognitive Science Society.\n\n[13] C. D. Batson. Rational processing or rationalization? The effect of disconfirming information\non a stated religious belief. Journal of Personality and Social Psychology, 32(1):176\u2013184,\n1975.\n\n[14] W. J. McGuire. 
The probabilogical model of cognitive structure and attitude change. In R. E.\nPetty, T. M. Ostrom, and T. C. Brock, editors, Cognitive Responses in Persuasion. Lawrence\nErlbaum Associates, 1981.\n", "award": [], "sourceid": 599, "authors": [{"given_name": "Alan", "family_name": "Jern", "institution": null}, {"given_name": "Kai-min", "family_name": "Chang", "institution": null}, {"given_name": "Charles", "family_name": "Kemp", "institution": null}]}