{"title": "Heuristics for Ordering Cue Search in Decision Making", "book": "Advances in Neural Information Processing Systems", "page_first": 1393, "page_last": 1400, "abstract": null, "full_text": "     Heuristics for Ordering Cue Search in \n                               Decision Making \n\n\n\n                      Peter M. Todd                                          Anja Dieckmann \n                        Center for Adaptive Behavior and Cognition \n                                 MPI for Human Development \n                             Lentzeallee 94, 14195 Berlin, Germany\n            ptodd@mpib-berlin.mpg.de                     dieckmann@mpib-berlin.mpg.de                  \n\n\n\n                                             Abstract\n\n          Simple lexicographic decision heuristics that consider cues one at a \n          time  in  a  particular  order  and  stop  searching  for  cues  as  soon  as  a \n          decision  can  be  made  have  been  shown  to  be  both  accurate  and \n          frugal  in  their  use  of  information.    But  much  of  the  simplicity  and \n          success  of  these  heuristics  comes  from  using  an  appropriate  cue \n          order.  For instance, the Take The Best heuristic uses validity order \n          for  cues,  which  requires  considerable  computation,  potentially \n          undermining  the  computational  advantages  of  the  simple  decision \n          mechanism.    But  many  cue  orders  can  achieve  good  decision \n          performance, and studies of sequential search for data records have \n          proposed  a  number  of  simple  ordering  rules  that  may  be  of  use  in \n          constructing  appropriate  decision  cue  orders  as  well.    
Here we consider a range of simple cue ordering mechanisms, including tallying, swapping, and move-to-front rules, and show that they can find cue orders that lead to reasonable accuracy and considerable frugality when used with lexicographic decision heuristics.

1 One-Reason Decision Making and Ordered Search

How do we know what information to consider when making a decision? Imagine the problem of deciding which of two objects or options is greater along some criterion, such as which of two cities is larger. We may know various facts about each city, such as whether they have a major sports team or a university or airport. To decide between them, we could weight and sum all the cues we know, or we could use a simpler lexicographic rule to look at one cue at a time in a particular order until we find a cue that discriminates between the options and indicates a choice [1]. Such lexicographic rules are used by people in a variety of decision tasks [2]-[4], and have been shown to be both accurate in their inferences and frugal in the amount of information they consider before making a decision. For instance, Gigerenzer and colleagues [5] demonstrated the surprising performance of several decision heuristics that stop information search as soon as one discriminating cue is found; because only that cue is used to make the decision, and no integration of information is involved, they called these heuristics one-reason decision mechanisms. Given some set of cues that can be looked up to make the decision, these heuristics differ mainly in the search rule that determines the order in which the information is searched.
But then the question of what information to consider becomes: how are these search orders determined?

Particular cue orders make a difference, as has been shown in research on the Take The Best heuristic (TTB) [6], [7]. TTB consists of three building blocks. (1) Search rule: Search through cues in the order of their validity, a measure of accuracy equal to the proportion of correct decisions made by a cue out of all the times that cue discriminates between pairs of options. (2) Stopping rule: Stop search as soon as one cue is found that discriminates between the two options. (3) Decision rule: Select the option to which the discriminating cue points, that is, the option that has the cue value associated with higher criterion values.

The performance of TTB has been tested on several real-world data sets, ranging from professors' salaries to fish fertility [8], in cross-validation comparisons with other more complex strategies. Across 20 data sets, TTB used on average only a third of the available cues (2.4 out of 7.7), yet still outperformed multiple linear regression in generalization accuracy (71% vs. 68%). The even simpler Minimalist heuristic, which searches through available cues in a random order, was more frugal (using 2.2 cues on average), yet still achieved 65% accuracy. But the fact that the accuracy of Minimalist lagged behind TTB by 6 percentage points indicates that part of the secret of TTB's success lies in its ordered search. Moreover, in laboratory experiments [3], [4], [9], people using lexicographic decision strategies have been shown to employ cue orders based on the cues' validities or a combination of validity and discrimination rate (the proportion of decision pairs on which a cue discriminates between the two options).
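The three TTB building blocks described above can be sketched in a few lines of code. This is a minimal illustration with toy data, not the original simulation code; the function name, the two example cities, and the binary cue profiles are all hypothetical.

```python
def ttb_choose(a, b, cue_order, cues):
    """Take The Best sketch: follow cue_order (search rule), stop at the
    first cue with unequal values (stopping rule), and pick the option
    whose cue value points to the higher criterion (decision rule)."""
    for c in cue_order:
        va, vb = cues[a][c], cues[b][c]
        if va != vb:                      # cue discriminates between a and b
            return a if va > vb else b
    return None                           # no cue discriminates; a guess would follow

# Toy domain: two cities, three binary cues (e.g. university, airport, sports team).
cues = {"A": [1, 0, 1], "B": [1, 1, 0]}
order = [0, 2, 1]                         # assumed validity order for illustration
print(ttb_choose("A", "B", order, cues))  # cue 0 ties, cue 2 discriminates -> "A"
```

Only the single discriminating cue determines the choice; no information is integrated, which is what makes this a one-reason decision mechanism.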
Thus, the cue order used by a lexicographic decision mechanism can make a considerable difference in accuracy; the same holds true for frugality, as we will see. But constructing an exact validity order, as used by Take The Best, takes considerable information and computation [10]. If there are N known objects to make decisions over, and C cues known for each object, then each of the C cues must be evaluated for whether it discriminates correctly (counting up R right decisions), incorrectly (W wrong decisions), or does not discriminate between each of the N(N-1)/2 possible object pairs, yielding C·N(N-1)/2 checks to perform to gather the information needed to compute cue validities (v = R/(R+W)) in this domain. But a decision maker typically does not know all of the objects to be decided upon, nor even all the cue values for those objects, ahead of time. Is there any simpler way to find an accurate and frugal cue order?

In this paper, we address this question through simulation-based comparison of a variety of simple cue-order-learning rules. Hope comes from two directions: first, there are many cue orders besides the exact validity ordering that can yield good performance; and second, research in computer science has demonstrated the efficacy of a range of simple ordering rules for a closely related search problem. Consequently, we find that simple mechanisms at the cue-order-learning stage can enable simple mechanisms at the decision stage, such as lexicographic one-reason decision heuristics, to perform well.
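The C·N(N-1)/2 checks behind the validity computation can be made concrete with a short sketch. The data below are toy values chosen for illustration; the function name is hypothetical.

```python
from itertools import combinations

def cue_validities(objects, criterion, cues):
    """Compute v = R/(R+W) per cue by checking every object pair,
    i.e. C * N(N-1)/2 discrimination checks in total."""
    n_cues = len(next(iter(cues.values())))
    right = [0] * n_cues
    wrong = [0] * n_cues
    for a, b in combinations(objects, 2):          # all N(N-1)/2 pairs
        for c in range(n_cues):                    # each of the C cues
            va, vb = cues[a][c], cues[b][c]
            if va == vb:
                continue                           # cue does not discriminate here
            points_to = a if va > vb else b
            bigger = a if criterion[a] > criterion[b] else b
            if points_to == bigger:
                right[c] += 1                      # R: right decision
            else:
                wrong[c] += 1                      # W: wrong decision
    return [r / (r + w) if r + w else 0.0 for r, w in zip(right, wrong)]

criterion = {"A": 3, "B": 2, "C": 1}
cues = {"A": [1, 0], "B": [0, 1], "C": [0, 0]}
print(cue_validities(["A", "B", "C"], criterion, cues))  # [1.0, 0.5]
```

Even this tiny example shows why the computation is demanding: it presupposes knowing all objects, all cue values, and all criterion values in advance.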
2 Simple approaches to constructing cue search orders

To compare different cue ordering rules, we evaluate the performance of different cue orders when used by a one-reason decision heuristic within a particular well-studied sample domain: large German cities, compared on the criterion of population size using 9 cues ranging from having a university to the presence of an intercity train line [6], [7]. Examining this domain makes it clear that there are many good possible cue orders. When used with one-reason stopping and decision building blocks, the mean accuracy of the 362,880 (9!) cue orders is 70%, equivalent to the performance expected from Minimalist. The accuracy of the validity order, 74.2%, falls toward the upper end of the accuracy range (62-75.8%), but there are still 7421 cue orders that do better than the validity order. The frugality of the search orders ranges from 2.53 cues per decision to 4.67, with a mean of 3.34 corresponding to using Minimalist; TTB has a frugality of 4.23, implying that most orders are more frugal. Thus, there are many accurate and frugal cue orders that could be found: a satisficing decision maker not requiring optimal performance need only land on one.

An ordering problem of this kind has been studied in computer science for nearly four decades, and can provide us with a set of potential heuristics to test. Consider the case of a set of data records arranged in a list, each of which will be required during a set of retrievals with a particular probability pi. On each retrieval, a key is given (e.g., a record's title) and the list is searched from the front to the end until the desired record, matching that key, is found.
The  goal  is  to  minimize  the  mean  search  time  for  accessing  the \nrecords  in  this  list,  for  which  the  optimal  ordering  is  in  decreasing  order  of  pi.    But  if \nthese retrieval probabilities are not known ahead of time, how can the list be ordered after \neach  successive  retrieval  to  achieve  fast  access?    This  is  the  problem  of  self-organizing \nsequential search [11], [12]. \n\nA  variety  of  simple  sequential  search  heuristics  have  been  proposed  for  this  problem, \ncentering on three main approaches: (1) transpose, in which a retrieved record is moved \none position closer to the front of the list (i.e., swapping with the record in front of it); (2) \nmove-to-front  (MTF),  in  which  a  retrieved  record  is  put  at  the  front  of  the  list,  and  all \nother records remain in the same relative order; and (3) count, in which a tally is kept of \nthe number of times each record is retrieved, and the list is reordered in decreasing order \nof  this  tally  after  each  retrieval.    Because  count  rules  require  storing  additional \ninformation,  more  attention  has  focused  on  the  memory-free  transposition  and  MTF \nrules.    Analytic  and  simulation  results  (reviewed  in  [12])  have  shown  that  while \ntransposition rules can come closer to the optimal order asymptotically, in the short run \nMTF rules converge more quickly (as can count rules).  This may make MTF (and count) \nrules more appealing as models of cue order learning by humans facing small numbers of \ndecision  trials.    Furthermore,  MTF  rules  are  more  responsive  to  local  structure  in  the \nenvironment (e.g., clumped retrievals over time of a few records), and transposition can \nresult in very poor performance under some circumstances (e.g., when neighboring pairs \nof popular records get trapped at the end of the list by repeatedly swapping places). 
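The three list-update approaches described above (transpose, move-to-front, and count) can be sketched as small list operations. This is an illustrative sketch, not code from the cited literature; the function names are hypothetical.

```python
def transpose(lst, i):
    """Transpose rule: move the retrieved record at index i one
    position closer to the front (swap with its predecessor)."""
    if i > 0:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]

def move_to_front(lst, i):
    """Move-to-front (MTF) rule: put the retrieved record at the front;
    all other records keep their relative order."""
    lst.insert(0, lst.pop(i))

def count_reorder(lst, tallies, key):
    """Count rule: tally each retrieval and keep the list sorted in
    decreasing order of tally (stable sort preserves ties)."""
    tallies[key] += 1
    lst.sort(key=lambda k: -tallies[k])

records = ["a", "b", "c", "d"]
transpose(records, 2)       # retrieve "c" -> ["a", "c", "b", "d"]
move_to_front(records, 3)   # retrieve "d" -> ["d", "a", "c", "b"]
```

Note the memory asymmetry the text points out: transpose and MTF need only the list itself, while the count rule must store one tally per record.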
It is important to note that there are substantive differences between the self-organizing sequential search problem and the cue-ordering problem we address here. In particular, when a record is sought that matches a particular key, search proceeds until the correct record is found. In contrast, when a decision is made lexicographically and the list of cues is searched through, there is no one correct cue to find: each cue may or may not discriminate (allow a decision to be made). Furthermore, once a discriminating cue is found, it may not even make the right decision. Thus, given feedback about whether a decision was right or wrong, a discriminating cue could potentially be moved up or down in the ordered list. This dissociation between making a decision or not (based on the cue discrimination rates), and making a right or wrong decision (based on the cue validities), means that there are two ordering criteria in this problem, frugality and accuracy, as opposed to the single criterion, search time, for records ordered by their retrieval probability pi. Because record search time corresponds to cue frugality, the heuristics that work well for the self-organizing sequential search task are likely to produce orders that emphasize frugality (reflecting cue discrimination rates) over accuracy in the cue-ordering task. Nonetheless, these heuristics offer a useful starting point for exploring cue-ordering rules.

2.1 The cue-ordering rules

We focus on search order construction processes that are psychologically plausible by being frugal both in terms of information storage and in terms of computation.
The decision situation we explore is different from the one assumed by Juslin and Persson [10], who strongly differentiate learning about objects from later making decisions about them. Instead we assume a learning-while-doing situation, consisting of tasks that have to be done repeatedly with feedback after each trial about the adequacy of one's decision. For instance, we can observe on multiple occasions which of two supermarket checkout lines, the one we have chosen or (more likely) another one, is faster, and associate this outcome with cues including the lines' lengths and the ages of their respective cashiers. In such situations, decision makers can learn about the differential usefulness of cues for solving the task via the feedback received over time.

We compare several explicitly defined ordering rules that construct cue orders for use by lexicographic decision mechanisms applied to a particular probabilistic inference task: forced choice paired comparison, in which a decision maker has to infer which of two objects, each described by a set of binary cues, is bigger on a criterion, just the task for which TTB was formulated. After an inference has been made, feedback is given about whether the decision was right or wrong. Therefore, the order-learning algorithm has information about which cues were looked up, whether a cue discriminated, and whether a discriminating cue led to the right or wrong decision. The rules we propose differ in which pieces of information they use and how they use them. We classify the learning rules based on their memory requirement (high versus low) and their computational requirements in terms of full or partial reordering (see Table 1).
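The learning-while-doing loop just described (decide lexicographically, receive right/wrong feedback, update the cue order) can be sketched as a generic simulation skeleton. The names and toy data are hypothetical; any of the order-learning rules discussed below could be plugged in as the `update` callback.

```python
import random

def lexicographic_decide(pair, order, cues):
    """Look up cues in the given order until one discriminates;
    return (chosen object, cue used), or a guess with cue None."""
    a, b = pair
    for c in order:
        if cues[a][c] != cues[b][c]:
            return (a if cues[a][c] > cues[b][c] else b), c
    return random.choice(pair), None      # guess if nothing discriminates

def learn_while_doing(pairs, order, cues, criterion, update):
    """Decide each pair, get feedback, and let `update` reorder the cues
    in place; returns the cumulative (online) accuracy."""
    correct = 0
    for pair in pairs:
        choice, cue = lexicographic_decide(pair, order, cues)
        bigger = max(pair, key=lambda o: criterion[o])
        right = choice == bigger          # trial-by-trial feedback
        correct += right
        if cue is not None:
            update(order, cue, right)     # e.g. a swap or move-to-front rule
    return correct / len(pairs)

# Toy run with a single perfectly valid cue and a no-op update (Minimalist-like).
criterion = {"A": 2, "B": 1}
cues = {"A": [1], "B": [0]}
acc = learn_while_doing([("A", "B")] * 5, [0], cues, criterion,
                        lambda order, cue, right: None)
print(acc)  # 1.0
```

The point of the skeleton is that the decision mechanism and the order-learning mechanism share one loop, which is exactly the intertwining discussed later in the results.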
Table 1: Learning rules classified by memory and computational requirements

High memory load, complete reordering:
- Validity: reorders cues based on their current validity
- Tally: reorders cues by the number of correct minus incorrect decisions made so far
- Associative/delta rule: reorders cues by learned association strength

High memory load, local reordering:
- Tally swap: moves a cue up (down) one position if it has made a correct (incorrect) decision and its tally of correct minus incorrect decisions is >= (<=) that of the next higher (lower) cue

Low memory load, local reordering:
- Simple swap: moves a cue up one position after a correct decision, and down after an incorrect decision
- Move-to-front (2 forms): Take The Last (TTL) moves the discriminating cue to the front; TTL-correct moves the cue to the front only if it correctly discriminates

The validity rule, a type of count rule, is the most demanding of the rules we consider in terms of both memory requirements and computational complexity. It keeps a count of all discriminations made by a cue so far (in all the times that the cue was looked up) and a separate count of all the correct discriminations. Therefore, memory load is comparatively high.
The validity of each cue is determined by dividing its current correct discrimination count by its total discrimination count. Based on these values, computed after each decision, the rule reorders the whole set of cues from highest to lowest validity.

The tally rule keeps only one count per cue, storing the number of correct decisions made by that cue so far minus the number of incorrect decisions. If a cue discriminates correctly on a given trial, one point is added to its tally; if it leads to an incorrect decision, one point is subtracted. The tally rule is less demanding in terms of memory and computation: only one count is kept per cue, and no division is required.

The simple swap rule uses the transposition rather than the count approach. This rule has no memory of cue performance other than an ordered list of all cues, and just moves a cue up one position in this list whenever it leads to a correct decision, and down if it leads to an incorrect decision. In other words, a correctly deciding cue swaps positions with its nearest neighbor upwards in the cue order, and an incorrectly deciding cue swaps positions with its nearest neighbor downwards.

The tally swap rule is a hybrid of the simple swap rule and the tally rule. It keeps a tally of correct minus incorrect discriminations per cue so far (so memory load is high) but only locally swaps cues: when a cue makes a correct decision and its tally is greater than or equal to that of its upward neighbor, the two cues swap positions. When a cue makes an incorrect decision and its tally is smaller than or equal to that of its downward neighbor, the two cues also swap positions.

We also evaluate two types of move-to-front rules.
First, the Take The Last (TTL) rule moves the last discriminating cue (that is, whichever cue was found to discriminate for the current decision) to the front of the order. This is equivalent to the Take The Last heuristic [6], [7], which uses a memory of cues that discriminated in the past to determine cue search order for subsequent decisions. Second, TTL-correct moves the last discriminating cue to the front of the order only if it correctly discriminated; otherwise, the cue order remains unchanged. This rule thus takes accuracy as well as frugality into account.

Finally, we include an associative learning rule that uses the delta rule to update cue weights according to whether they make correct or incorrect discriminations, and then reorders all cues in decreasing order of this weight after each decision. This corresponds to a simple network with nine input units encoding the difference in cue value between the two objects (A and B) being decided on (i.e., ini = -1 if cuei(A) < cuei(B), 1 if cuei(A) > cuei(B), and 0 if cuei(A) = cuei(B) or cuei was not checked) and with one output unit whose target value encodes the correct decision (t = 1 if criterion(A) > criterion(B), otherwise -1), and with the weights between inputs and output updated according to Δwi = lr · (t - ini·wi) · ini, with learning rate lr = 0.1. We expect this rule to behave similarly to Oliver's rule initially (moving a cue to the front of the list by giving it the largest weight when weights are small) and to swap later on (moving cues only a short distance once weights are larger).
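The local-reordering and move-to-front rules above amount to a handful of small in-place list updates. The following sketch is illustrative (function names and the tally dictionaries are hypothetical); each function takes the current cue order, the discriminating cue, and the right/wrong feedback described in the text.

```python
def simple_swap(order, cue, right):
    """Move the cue up one position after a correct decision,
    down one position after an incorrect one."""
    i = order.index(cue)
    if right and i > 0:
        order[i - 1], order[i] = order[i], order[i - 1]
    elif not right and i < len(order) - 1:
        order[i + 1], order[i] = order[i], order[i + 1]

def tally_update(order, tallies, cue, right):
    """Tally rule: one correct-minus-incorrect count per cue;
    reorder all cues by decreasing tally (stable sort keeps ties)."""
    tallies[cue] += 1 if right else -1
    order.sort(key=lambda c: -tallies[c])

def tally_swap(order, tallies, cue, right):
    """Hybrid: keep tallies, but only swap with a neighbor when the
    tallies permit it (>= upward neighbor, <= downward neighbor)."""
    tallies[cue] += 1 if right else -1
    i = order.index(cue)
    if right and i > 0 and tallies[cue] >= tallies[order[i - 1]]:
        order[i - 1], order[i] = order[i], order[i - 1]
    elif not right and i < len(order) - 1 and tallies[cue] <= tallies[order[i + 1]]:
        order[i + 1], order[i] = order[i], order[i + 1]

def ttl(order, cue, right, correct_only=False):
    """Take The Last: move the discriminating cue to the front;
    with correct_only=True this becomes TTL-correct."""
    if correct_only and not right:
        return
    order.insert(0, order.pop(order.index(cue)))

order = [0, 1, 2]
simple_swap(order, 2, True)   # cue 2 decided correctly -> [0, 2, 1]
```

The differing memory demands in Table 1 are visible directly: the swap and move-to-front rules touch only the order itself, while the tally-based rules also carry a per-cue count.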
3 Simulation Study of Simple Ordering Rules

To test the performance of these order learning rules, we use the German cities data set [6], [7], consisting of the 83 largest-population German cities (those with more than 100,000 inhabitants), described on 9 cues that give some information about population size. Discrimination rate and validity of the cues are negatively correlated (r = -.47). We present results averaged over 10,000 learning trials for each rule, starting from random initial cue orders. Each trial consisted of 100 decisions between randomly selected decision pairs. For each decision, the current cue order was used to look up cues until a discriminating cue was found, which was used to make the decision (employing a one-reason or lexicographic decision strategy). After each decision, the cue order was updated using the particular order-learning rule. We start by considering the cumulative accuracies (i.e., online or amortized performance [12]) of the rules, defined as the total percentage of correct decisions made so far at any point in the learning process. The contrasting measure of offline accuracy (how well the current learned cue order would do if it were applied to the entire test set) will be reported subsequently (see Figure 1).

For all but the move-to-front rules, cumulative accuracies soon rise above that of the Minimalist heuristic (proportion correct = .70), which looks up cues in random order and thus serves as a lower benchmark. However, at least throughout the first 100 decisions, cumulative accuracies stay well below the (offline) accuracy that would be achieved by using TTB for all decisions (proportion correct = .74), looking up cues in the true order of their ecological validities.
Except for the move-to-front rules, whose cumulative accuracies are very close to Minimalist (mean proportion correct in 100 decisions: TTL: .701; TTL-correct: .704), all learning rules perform on a surprisingly similar level, with less than one percentage point difference in favor of the most demanding rule (i.e., delta rule: .719) compared to the least (i.e., simple swap: .711; for comparison: tally swap: .715; tally: .716; validity learning rule: .719). Offline accuracies are slightly higher, again with the exception of the move-to-front rules (TTL: .699; TTL-correct: .702; simple swap: .714; tally swap: .719; tally: .721; validity learning rule: .724; delta rule: .725; see Figure 1). In longer runs (10,000 decisions) the validity learning rule is able to converge on TTB's accuracy, but the tally rule's performance changes little (to .73).

Figure 1: Mean offline accuracy of order learning rules

Figure 2: Mean offline frugality of order learning rules

All learning rules are, however, more frugal than TTB, and even more frugal than Minimalist, both in terms of online as well as offline frugality. Let us focus on their offline frugality (see Figure 2): on average, the rules look up fewer cues than Minimalist before reaching a decision. There is little difference between the associative rule, the tallying rules, and the swapping rules (mean number of cues looked up in 100 decisions: delta rule: 3.20; validity learning rule: 3.21; tally: 3.01; tally swap: 3.04; simple swap: 3.13). Most frugal are the two move-to-front rules (TTL-correct: 2.87; TTL: 2.83).
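The two performance measures used throughout this section can be sketched concretely: cumulative (online) accuracy is the running proportion of correct decisions, and offline frugality is the mean number of cues a frozen order looks up over a test set. The helper names and toy data below are hypothetical.

```python
def online_accuracy(results):
    """Cumulative proportion of correct decisions after each trial,
    i.e. the online/amortized performance curve."""
    correct, curve = 0, []
    for t, right in enumerate(results, start=1):
        correct += right
        curve.append(correct / t)
    return curve

def offline_frugality(order, cues, pairs):
    """Mean number of cues a frozen cue order looks up per decision;
    all cues are charged when none discriminates."""
    total = 0
    for a, b in pairs:
        looked = len(order)
        for n, c in enumerate(order, start=1):
            if cues[a][c] != cues[b][c]:
                looked = n                 # stopped at the n-th cue
                break
        total += looked
    return total / len(pairs)

curve = online_accuracy([True, False, True, True])   # ends at 3/4 = 0.75
cues = {"A": [1, 0], "B": [1, 1], "C": [0, 0]}
frug = offline_frugality([0, 1], cues, [("A", "B"), ("A", "C")])  # 1.5
```

Measuring frugality this way makes the link to record search time explicit: both count how deep into the list the search goes.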
Consistent with this finding, all of the learning rules lead to cue orders that show positive correlations with the discrimination rate cue order (reaching the following values after 100 decisions: validity learning rule: r = .18; tally: r = .29; tally swap: r = .24; simple swap: r = .18; TTL-correct: r = .48; TTL: r = .56). This means that cues that often lead to discriminations are more likely to end up in the first positions of the order. This is especially true for the move-to-front rules. In contrast, the cue orders resulting from all learning rules but the validity learning rule do not correlate or correlate negatively with the validity cue order, and even the correlations of the cue orders resulting from the validity learning rule after 100 decisions only reach an average r = .12.

But why would the discrimination rates of cues exert more of a pull on cue order than validity, even when the validity learning rule is applied? As mentioned earlier, this is what we would expect for the move-to-front rules, but it was unexpected for the other rules. Part of the explanation comes from the fact that in the city data set we used for the simulations, validity and discrimination rate of cues are negatively correlated. Having a low discrimination rate means that a cue has little chance to be used and hence to demonstrate its high validity. Whatever learning rule is used, if such a cue is displaced downward to the lower end of the order by other cues, it may have few chances to escape to the higher ranks where it belongs.
The problem is that when a decision pair is finally encountered for which that cue would lead to a correct decision, it is unlikely to be checked, because other cues, more discriminating although less valid, are looked up before it and already bring about a decision. Thus, because one-reason decision making is intertwined with the learning mechanism and so influences which cues can be learned about, what mainly makes a cue come early in the order is producing a high number of correct decisions, and not so much a high ratio of correct discriminations to total discriminations regardless of base rates.

This argument indicates that performance may differ in environments where cue validities and discrimination rates correlate positively. We tested the learning rules on one such data set (r = .52) of mammal species' life expectancies, predicted from 9 cues. It also differs from the cities environment in having a greater difference between TTB's and Minimalist's performance (6.5 vs. 4 percentage points). In terms of offline accuracy, the validity learning rule now indeed more closely approaches TTB's accuracy after 100 decisions (.773 vs. .782). The tally rule, in contrast, behaves very much as in the cities environment, reaching an accuracy of .752, halfway between TTB and Minimalist (accuracy = .716). Thus only some learning rules can profit from the positive correlation.

4 Discussion

Most of the simpler cue order learning rules we have proposed do not fall far behind a validity learning rule in accuracy, and although the move-to-front rules cannot beat the accuracy achieved if cues were selected randomly, they compensate for this failure by being highly frugal.
Interestingly, the rules that do achieve higher accuracy than Minimalist also beat random cue selection in terms of frugality.

On the other hand, all rules, even the delta rule and the validity learning rule, stay below TTB's accuracy across a relatively high number of decisions. But often it is necessary to make good decisions without much experience. Therefore, learning rules should be preferred that quickly lead to orders with good performance. The relatively complex rules with relatively high memory requirements, i.e., the delta and the validity learning rules, but also the tally learning rule, rise in accuracy more quickly than the rules with lower requirements. The tally rule thus represents an especially good compromise between considerations of cost, correctness, and psychological plausibility.

Remember that the rules based on tallies assume full memory of all correct minus incorrect decisions made by a cue so far. But this does not make the rules implausible, at least from a psychological perspective, even though computer scientists were reluctant to adopt such counting approaches because of their extra memory requirements. There is considerable evidence that people are actually very good at remembering the frequencies of events. Hasher and Zacks [13] conclude from a wide range of studies that frequencies are encoded in an automatic way, implying that people are sensitive to this information without intention or special effort. Estes [14] pointed out the role frequencies play in decision making as a shortcut for probabilities. Further, the tally rule and the tally swap rule are comparatively simple, not having to keep track of base rates or perform divisions as does the validity rule.
Conversely, the simple swap and move-to-front rules may not be much simpler, because storing a cue order may be about as demanding as storing a set of tallies. We have run experiments (reported elsewhere) in which, indeed, the tally swap rule best accounts for people's actual processes of ordering cues.

Our goal in this paper was to explore how well simple cue-ordering rules could work in conjunction with lexicographic decision strategies. This is important because it is necessary to take into account the set-up costs of a heuristic in addition to its application costs when considering the mechanism's overall simplicity. As the example of the validity search order of TTB shows, what is easy to apply may not necessarily be so easy to set up. But simple rules can also be at work in the construction of a heuristic's building blocks. We have proposed such rules for the construction of one building block, the search order. Simple learning rules inspired by research in computer science can enable a one-reason decision heuristic to perform only slightly worse than if it had full knowledge of cue validities from the very beginning. Giving up the assumption of full a priori knowledge for the slight decrease in accuracy seems like a reasonable bargain: through the addition of learning rules, one-reason decision heuristics might lose some of their appeal to decision theorists who were surprised by the performance of such simple mechanisms compared to more complex algorithms, but they gain psychological plausibility and so become more attractive as explanations for human decision behavior.

References

[1] Fishburn, P.C. (1974). Lexicographic orders, utilities and decision rules: A survey. Management Science, 20, 1442-1471.

[2] Payne, J.W., Bettman, J.R., & Johnson, E.J. (1993).
The adaptive decision maker. New York: Cambridge University Press.

[3] Bröder, A. (2000). Assessing the empirical validity of the Take-The-Best heuristic as a model of human probabilistic inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26 (5), 1332-1346.

[4] Bröder, A. (2003). Decision making with the adaptive toolbox: Influence of environmental structure, intelligence, and working memory load. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 611-625.

[5] Gigerenzer, G., Todd, P.M., & The ABC Research Group (1999). Simple heuristics that make us smart. New York: Oxford University Press.

[6] Gigerenzer, G., & Goldstein, D.G. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103 (4), 650-669.

[7] Gigerenzer, G., & Goldstein, D.G. (1999). Betting on one good reason: The Take The Best heuristic. In G. Gigerenzer, P.M. Todd & The ABC Research Group, Simple heuristics that make us smart. New York: Oxford University Press.

[8] Czerlinski, J., Gigerenzer, G., & Goldstein, D.G. (1999). How good are simple heuristics? In G. Gigerenzer, P.M. Todd & The ABC Research Group, Simple heuristics that make us smart. New York: Oxford University Press.

[9] Newell, B.R., & Shanks, D.R. (2003). Take the best or look at the rest? Factors influencing one-reason decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 53-65.

[10] Juslin, P., & Persson, M. (2002). PROBabilities from EXemplars (PROBEX): A lazy algorithm for probabilistic inference from generic knowledge. Cognitive Science, 26, 563-607.

[11] Rivest, R. (1976). On self-organizing sequential search heuristics. Communications of the ACM, 19(2), 63-67.

[12] Bentley, J.L., & McGeoch, C.C. (1985).
Amortized  analyses  of  self-organizing  sequential \nsearch heuristics.  Communications of the ACM, 28(4), 404-411. \n\n[13] Hasher, L., & Zacks, R.T. (1984). Automatic Processing of fundamental information: The case \nof frequency of occurrence. American Psychologist, 39, 1372-1388. \n\n[14] Estes, W.K. (1976). The cognitive side of probability learning. Psychological Review, 83, 37-\n64.\n\n\f\n", "award": [], "sourceid": 2635, "authors": [{"given_name": "Peter", "family_name": "Todd", "institution": null}, {"given_name": "Anja", "family_name": "Dieckmann", "institution": null}]}