{"title": "Strategic Impatience in Go/NoGo versus Forced-Choice Decision-Making", "book": "Advances in Neural Information Processing Systems", "page_first": 2123, "page_last": 2131, "abstract": "Two-alternative forced choice (2AFC) and Go/NoGo (GNG) tasks are behavioral choice paradigms commonly used to study sensory and cognitive processing in choice behavior. While GNG is thought to isolate the sensory/decisional component by removing the need for response selection, a consistent bias towards the Go response (higher hit and false alarm rates) in the GNG task suggests possible fundamental differences in the sensory or cognitive processes engaged in the two tasks. Existing mechanistic models of these choice tasks, mostly variants of the drift-diffusion model (DDM; [1,2]) and the related leaky competing accumulator models [3,4], capture various aspects of behavior but do not address the provenance of the Go bias. We postulate that this ``impatience'' to go is a strategic adjustment in response to the implicit asymmetry in the cost structure of GNG: the NoGo response requires waiting until the response deadline, while a Go response immediately terminates the current trial. We show that a Bayes-risk minimizing decision policy that minimizes both error rate and average decision delay naturally exhibits the experimentally observed bias. The optimal decision policy is formally equivalent to a DDM with a time-varying threshold that initially rises after stimulus onset, and collapses again near the response deadline. The initial rise is due to the fading temporal advantage of choosing the Go response over the fixed-delay NoGo response. We show that fitting a simpler, fixed-threshold DDM to the optimal model reproduces the counterintuitive result of a higher threshold in GNG than in 2AFC decision-making, previously observed in direct DDM fits to behavioral data [2], although such approximations cannot reproduce the Go bias. 
Thus, observed discrepancies between GNG and 2AFC decision-making may arise from rational strategic adjustments to the cost structure, and need not imply additional differences in the underlying sensory and cognitive processes.", "full_text": "Strategic Impatience in Go/NoGo versus Forced-Choice Decision-Making\n\nPradeep Shenoy\nCognitive Science Department\nUniversity of California, San Diego\nLa Jolla, CA, 92093\npshenoy@ucsd.edu\n\nAngela J. Yu\nCognitive Science Department\nUniversity of California, San Diego\nLa Jolla, CA, 92093\najyu@ucsd.edu\n\nAbstract\n\nTwo-alternative forced choice (2AFC) and Go/NoGo (GNG) tasks are behavioral choice paradigms commonly used to study sensory and cognitive processing in choice behavior. While GNG is thought to isolate the sensory/decisional component by eliminating the need for response selection as in 2AFC, a consistent tendency for subjects to make more Go responses (both higher hit and false alarm rates) in the GNG task raises the concern that there may be fundamental differences in the sensory or cognitive processes engaged in the two tasks. Existing mechanistic models of these choice tasks, mostly variants of the drift-diffusion model (DDM; [1, 2]) and the related leaky competing accumulator models [3, 4], capture various aspects of behavioral performance, but do not clarify the provenance of the Go bias in GNG. We postulate that this "impatience" to go is a strategic adjustment in response to the implicit asymmetry in the cost structure of the 2AFC and GNG tasks: the NoGo response requires waiting until the response deadline, while a Go response immediately terminates the current trial. We show that a Bayes-risk minimizing decision policy that minimizes not only error rate but also average decision delay naturally exhibits the experimentally observed Go bias. 
The optimal decision policy is formally equivalent to a DDM with a time-varying threshold that initially rises after stimulus onset, and collapses again just before the response deadline. The initial rise in the threshold is due to the diminishing temporal advantage of choosing the fast Go response compared to the fixed-delay NoGo response. We also show that fitting a simpler, fixed-threshold DDM to the optimal model reproduces the counterintuitive result of a higher threshold in GNG than in 2AFC decision-making, previously observed in direct DDM fits to behavioral data [2], although such fixed-threshold approximations cannot reproduce the Go bias. Our results suggest that observed discrepancies between GNG and 2AFC decision-making may arise from rational strategic adjustments to the cost structure, and thus need not imply any other difference in the underlying sensory and cognitive processes.\n\n1 Introduction\n\nThe two-alternative forced-choice (2AFC) task is a standard experimental paradigm used in psychology and neuroscience to investigate various aspects of sensory, motor, and cognitive processing [5]. Typically, the paradigm involves a forced choice between two responses based on a presented stimulus, with the measured response time and accuracy of choices shedding light on the cognitive and neural processes underlying behavior. Another paradigm that appears to share many features of the 2AFC task is the Go/NoGo (GNG) task [6] (see Luce [5] for a review), where one stimulus category is associated with an overt Go response that has to be executed before a response deadline, and the other stimulus (NoGo) requires withholding the response until the response deadline has elapsed. 
In principle, the GNG task could be used to probe the same decision-making problems as the 2AFC task, with the possible advantage of eliminating a "response selection stage" that may follow the decision in the 2AFC task [6, 7]. Indeed, the GNG task has been used to study various aspects of human and animal cognition, e.g., lexical judgements [8, 9], perceptual decision-making [10, 11, 12], and the neural basis of choice behavior (in particular, distinguishing among neural activations associated with stimulus, memory, and response) [13, 14, 15]. However, experimental evidence also indicates a curious choice bias toward the overt (Go) response in the GNG task [11, 16, 2, 15], in the form of shorter response times and more false alarms for the Go response than for the same stimulus pairings in a 2AFC task [2, 16]. It has been suggested that this choice bias may reflect differential sensory and cognitive processes underlying the two tasks, thus making the two non-interchangeable in the study of perception and decision-making.\n\nIn this paper, we hypothesize that this discrepancy may simply be due to differences in the implicit reward (cost) structure of the two tasks: the NoGo response incurs a higher imposed waiting cost than the Go response, since the NoGo response must wait until the response deadline has passed to register, while a Go response immediately terminates the trial. In contrast, in the 2AFC task, the cost function is symmetric for the two alternatives, whether in terms of error or delay. 
We propose that the implicit cost structure difference in GNG can fully account for the Go bias in GNG compared to 2AFC tasks, without the need to appeal to other differences in sensory or cognitive processing. To investigate this hypothesis, we adopt a Bayes risk minimization framework for both the 2AFC and GNG tasks, whereby sensory processing is modeled as iterative Bayesian inference of stimulus type based on a stream of noisy sensory input, and the decision of when/how to respond rests on a policy that minimizes a linear combination of expected decision delay and response errors. The optimal decision policy for this Bayes-risk formulation in the 2AFC task is known as the sequential probability ratio test (SPRT; [17, 18]), and has been shown to account for both behavioral [19, 4] and neural data [19, 20]. Here, we generalize this theoretical framework to account for both 2AFC and GNG decision-making in a unified framework, by assuming that a subject's sensory and perceptual processing (of the same pair of stimuli) and the relative preference for decision accuracy versus speed are shared across 2AFC and GNG, with the only difference between them being the asymmetric temporal cost implicit in the reward structure of the GNG task – the Go response terminates a trial, while the NoGo response only registers after the response deadline.\n\nAs a stochastic process, the SPRT is a bounded random walk, where the stochasticity in the random walk comes from noise in the observation process. The continuum (time) limit of a bounded random walk is the bounded drift-diffusion model (DDM), which generally assumes that a stochastic dynamic variable undergoes constant drift, as well as diffusion due to Wiener noise, until one of two finite thresholds is breached. 
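The SPRT-as-bounded-random-walk view described above is easy to make concrete. The sketch below is our own illustration (parameter values are arbitrary, not fitted quantities): it accumulates the log-likelihood ratio of i.i.d. Gaussian samples with means ±µ until a symmetric threshold is crossed.

```python
import math
import random

def sprt_trial(d, mu=0.25, sigma=1.0, thresh=2.0, rng=random):
    """Simulate one SPRT trial for true stimulus d in {0, 1}: accumulate the
    log-likelihood ratio log f1(x)/f0(x) of i.i.d. Gaussian samples until it
    exits [-thresh, +thresh]. Returns the decision and the number of samples."""
    s, t = 0.0, 0
    mean = mu if d == 1 else -mu
    while abs(s) < thresh:
        x = rng.gauss(mean, sigma)
        s += 2.0 * mu * x / sigma**2  # log f1(x)/f0(x) for N(+/-mu, sigma^2)
        t += 1
    return (1 if s > 0 else 0), t
```

Raising `thresh` lowers the error rate but lengthens the decision, which is exactly the error/delay trade-off that the Bayes-risk formulation below makes explicit.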
In psychology, the DDM has been augmented with additional parameters such as a non-decision-related response delay, variability in drift rate, and variability in starting point across trials. Figure 4A shows a simple variant of the DDM illustrating the following parameters: rate of accumulation, threshold, and "nondecision time" or temporal offset to the start of the diffusion process. These augmented DDMs have been used to model behavior in 2AFC tasks [21, 22, 23, 5, 24, 4], and also appear to provide good descriptive accounts of the neural activities underlying perceptual decision-making [25, 20, 26, 27]. Variants of the augmented DDM have also been utilized to fit data in other simple decision-making tasks, including the GNG task [2]. While augmenting the DDM with extra parameters gives it additional power in explaining subtleties in data, this also diminishes the normative interpretability of DDM fits by eliminating its formal relationship to the optimal SPRT procedure. As a consequence, when the behavioral objectives change, e.g., in the GNG task, the DDM cannot predict a priori which parameters ought to change and by how much. Instead, we begin with a Bayes-risk minimization formulation and derive the non-parametric optimal decision procedure as a function of sensory statistics and behavioral objectives. We then map the optimal policy to the DDM model space, and compare directly with previously proposed DDM variants in the context of 2AFC and GNG tasks.\n\nIn the following sections, we first describe our proposed Bayesian inference and decision-making model, then compare simulations of the optimal decision-making model with published experimental data from subjects performing perceptual decision-making in 2AFC and GNG tasks [16]. We also examine further evidence concerning the degree of Go bias in the GNG task [28]. 
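A minimal simulation of the simple DDM variant just described (rate, threshold, and nondecision time, as in Figure 4A) might look as follows; the per-step noise scale and all parameter values are illustrative assumptions, not fitted values.

```python
import random

def ddm_trial(rate=0.1, thresh=1.0, t_nd=5, noise=0.1, max_steps=10_000, rng=random):
    """One trial of a simple DDM: a decision variable drifts at `rate` per step
    with Gaussian diffusion noise until it crosses +/-thresh; the reported RT
    adds a fixed nondecision time t_nd."""
    v, t = 0.0, 0
    while abs(v) < thresh and t < max_steps:
        v += rate + rng.gauss(0.0, noise)  # drift plus diffusion per step
        t += 1
    choice = 1 if v >= thresh else 0
    return choice, t + t_nd
```

The nondecision offset `t_nd` simply shifts the whole RT distribution, while `rate` and `thresh` jointly control speed and accuracy.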
Next, we consider the formal relationship between the optimal model and a fixed-threshold DDM that was previously utilized to fit behavioral data from the GNG task [2, 12]. Finally, we present novel experimental\n\nFigure 1: Systematic error biases in the GNG task. (A) The figure shows error rates associated with a perceptual decision-making task performed by subjects in both Go/NoGo and Yes/No (forced choice) settings. Although the error rates in the forced-choice setting were similar for both classes, there was a significant bias towards the Go response in the GNG task, with more false alarms than omission errors. (B) Mean response time on the GNG task was lower than for the same stimulus on the 2AFC task. (Data adapted from Bacon-Macé et al., 2007.)\n\npredictions of the optimal decision-making model, including those that specifically differ from the fixed-threshold DDM approximation [2, 12].\n\n2 Bayesian inference and risk minimization in choice tasks\n\nHuman choice behavior in the GNG and 2AFC tasks exhibits a consistent Go bias in the GNG task that is not apparent for the same stimulus in the 2AFC task. For example, Figure 1 shows data from a task in which subjects must identify whether a briefly-presented noisy image contains an animal or not [16], under two different response conditions: GNG (only respond to animal-present images), and 2AFC (respond yes/no to each image). 
Subjects showed a significant bias towards the Go response in the GNG task, in the form of a higher false alarm rate than omission error rate (Figure 1A), as well as faster RT than for the same stimulus in the 2AFC task (Figure 1B).\n\nFor the 2AFC task, a large body of literature supports the "accumulate-to-bound" model of perceptual decision-making [23, 20, 26], where moment-to-moment sensory input ("evidence" in favor of either choice) is accumulated over time until it reaches a bound, at which point a response is generated. Previous work by Yu & Frazier [29] extended the formulation to 2AFC tasks with a decision deadline, in which subjects have the additional constraint of not exceeding the deadline. They showed that the optimal policy for decision-making under a deadline is to accumulate evidence up to time-varying thresholds that collapse toward each other over time, leading to more "liberal" choices and a higher error rate in later responses than in earlier ones. Here, we generalize the framework to model the GNG task. In particular, the deadline by which the subject must make a response on a Go trial (or else be counted as a "miss") is the same deadline until which the subject must withhold response on a NoGo trial (or else be counted as a "false alarm"). We model evidence accumulation as iterative Bayesian inference over the identity of the stimulus, and decision-making as an iterative decision policy that chooses, based on the current evidence, whether to respond (and which response, in 2AFC) or to continue observing for at least one more time point. The optimal policy minimizes the expected value of a cost function that depends linearly on decision delay and errors. 
The model is described below.\n\n2.1 Evidence integration as Bayesian inference\n\nWe model evidence accumulation, in both 2AFC and GNG, as iterative Bayesian inference about the stimulus identity conditioned on an independent and identically distributed (i.i.d.) stream of sensory input. Specifically, we assume a generative model in which the observations are a continual sequence of data samples x1, x2, . . ., generated i.i.d. from a likelihood function f0(x) or f1(x) depending on whether the true stimulus state is d = 0 or d = 1, respectively. This incoming stream therefore provides accumulating evidence about the hidden category label d ∈ {0, 1}. For concreteness, we assume the likelihood functions are Gaussian distributions with means ±µ (+ for d = 1, − for d = 0), and a variance parameter σ2 controlling the noisiness of the stimuli.\n\nFigure 2: Rational behavior in 2AFC and GNG tasks. (A) The figure shows the decision threshold on the belief state, as a function of time, for the 2AFC and GNG tasks. The optimal decision boundary for 2AFC is a pair of parallel thresholds (solid lines) that collapse and meet at the response deadline (indicated by the dashed vertical line). The optimal GNG decision boundary is a single initially increasing threshold (dashed line) that decreases to 0.5 at the response deadline. (B;C) Monte Carlo simulations of the optimal policy show a bias towards the overt response in the GNG task. The two response alternatives in the 2AFC task are represented as "left" and "right", corresponding to "nogo" and "go" in the GNG task (B). The GNG task shows a lower miss rate and higher false alarm rate than the corresponding 2AFC error rates (B), along with faster RT than the 2AFC task (C). 
Compare to the experimental data in Figure 1. Parameter settings: c = 0.01, µ = 0.25, D = 40 timesteps.\n\nThe recognition model specifies the mechanism by which the stimulus identity is inferred from the noisy observations xt. In our model, we compute a posterior distribution over the category label conditioned on the data sampled so far, xt ≜ (x1, x2, . . . xt): bt ≜ P{d = 1|xt}, also known as the belief state, by iteratively applying Bayes' rule:\n\nbt+1 = bt f1(xt+1) / (bt f1(xt+1) + (1 − bt) f0(xt+1))    (1)\n\nwhere b0 ≜ P{d = 1} is the prior probability of the stimulus category being 1 (and is 0.5 for equally likely stimuli). We hypothesize that the same evidence accumulation mechanism underlies decision-making in both tasks, in particular with the same noise process/likelihood functions, f0(x) and f1(x), for a particular individual observing the same stimuli.\n\n2.2 Action selection as Bayes-risk minimization\n\nWe model behavior in the two tasks as a sequential decision-making process where, at each instant, the model decides between two actions as a function of the evidence so far, encapsulated in the current belief state bt: stop (and, for 2AFC, choose the response for the more probable stimulus category), or continue for one more time step. A stopping policy is a mapping from the belief state to the action space, π : bt ↦ {stop, continue}, where the stop action in 2AFC also requires a stimulus category decision δ. In accordance with the standard Bayes risk framework for optimizing the decision policy in a stopping problem, we assume that the behavioral cost function is a linear combination of the probability of making a decision error and the expected decision delay τ (the stopping time if a response is emitted before the deadline, and the deadline D otherwise). 
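Equation (1) translates directly into code; the Gaussian likelihoods with means ±µ follow the generative model of Section 2.1, and the parameter values below are illustrative.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of N(mean, sigma^2) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def update_belief(b, x, mu=0.25, sigma=1.0):
    """One step of Eq. (1): new posterior P(d = 1 | data) after observing x,
    starting from the current belief b."""
    f1 = normal_pdf(x, +mu, sigma)
    f0 = normal_pdf(x, -mu, sigma)
    return b * f1 / (b * f1 + (1 - b) * f0)
```

Starting from the flat prior b0 = 0.5, a run of positive samples pushes the belief toward 1; the resulting trajectory bt is the accumulating evidence that the decision policy thresholds.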
We assume that the decision delay component is weighted by a sampling or time cost c, while all decision errors are penalized by the same magnitude, normalized to unit cost. Based on this cost function, the optimal decision policy is the policy that minimizes the overall expected cost:\n\n2AFC: Lπ = c⟨τ⟩ + P{δ ≠ d} + P{τ = D}    (2)\nGNG: Lπ = c⟨τ⟩ + P{τ = D|d = 1}P{d = 1} + P{τ < D|d = 0}P{d = 0}    (3)\n\nThe 2AFC cost function is a special case of the more general scenario previously considered for deadlined sequential hypothesis testing [29]: P{δ ≠ d} is the expected wrong-response cost, while P{τ = D} is the expected cost of not responding before the deadline (omission error). In the GNG cost function, P{τ = D|d = 1} is the probability that no response is emitted before the deadline on a Go trial (miss), and P{τ < D|d = 0} is the probability that a NoGo trial is terminated by a Go\n\nFigure 3: Influence of stimulus statistics on Go bias. Our model predicts that false alarms are more frequent than misses (A), and are also faster than correct Go RTs (B). The Go bias, which is apparent at 50% Go trials, is significantly increased when Go trials are more frequent (80%), and reduced when Go trials make up only 20% of the trials. Parameter settings: c = 0.014, µ = 0.45, D = 40 timesteps. 
(C-D) Human subjects exhibited a similar pattern of behavior in a letter discrimination task. (Data from Nieuwenhuis et al., 2003.)\n\nresponse (false alarm), a correct hit requires τ < D (responding before the deadline), and a correct NoGo response consists of a series of continue actions until the predefined response deadline D. In both the GNG and 2AFC tasks, the choice to stop limits the decision delay cost, while the choice to continue (up to the predefined response deadline D) results in the collection of more data that help to disambiguate the stimulus category, at the cost of c per additional sample of data observed. We compute the optimal policy using Bellman's dynamic programming principle (Bellman, 1952). Specifically, we iteratively compute the expected cost of continue and stop as a function of the belief state bt (these are the Q-factors for continue and stop, Qc(bt) and Qs(bt)). If Qc(bt) < Qs(bt), then the optimal policy chooses to continue; otherwise, it chooses to stop. The belief state space is therefore partitioned by the decision policy into a continuation region and a stopping region (details omitted due to lack of space).\n\nThe principal difference between the two tasks as formulated here is the loss function. In the 2AFC task, all trials are terminated by a response (unless the response deadline is exceeded). In the GNG version, however, subjects have to wait until the response deadline to choose the NoGo response. This introduces a significant extra time cost for NoGo responses, suggesting that it may in some cases be better to select the Go response despite the relative inadequacy of the sensory evidence. We explore these aspects in detail in the following section.\n\n3 Results\n\nOpportunity cost and the Go/NoGo decision threshold\n\nFigure 2A illustrates the difference between the optimal decision policies for the two tasks. 
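The dynamic-programming computation whose details are omitted in the text can be reproduced along the following lines. The belief-state discretization and the quadrature over observations are our own illustrative choices; the cost terms follow Eq. (3) with unit error costs and the default parameters of Figure 2.

```python
import math

def normal_pdf(x, mean, sigma):
    return math.exp(-(x - mean) ** 2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def posterior(b, x, mu, sigma):
    """Eq. (1): belief update after observing x."""
    f1, f0 = normal_pdf(x, +mu, sigma), normal_pdf(x, -mu, sigma)
    return b * f1 / (b * f1 + (1 - b) * f0)

def gng_thresholds(c=0.01, mu=0.25, sigma=1.0, D=40, nb=101, nx=41):
    """Backward induction for the GNG stopping problem: for each time step
    t < D, return the smallest belief at which stopping (Go) is at least as
    cheap as continuing."""
    grid = [i / (nb - 1) for i in range(nb)]
    xs = [-4 * sigma + 8 * sigma * i / (nx - 1) for i in range(nx)]
    dx = 8 * sigma / (nx - 1)
    V = list(grid)            # V_D(b) = P(miss | NoGo registered at deadline) = b
    thresholds = [1.0] * D
    for t in range(D - 1, -1, -1):
        newV = []
        thr = 1.0
        for b in grid:
            # expected continuation cost: quadrature over the next observation,
            # whose predictive density is the mixture b*f1 + (1-b)*f0
            ev = 0.0
            for x in xs:
                px = b * normal_pdf(x, mu, sigma) + (1 - b) * normal_pdf(x, -mu, sigma)
                j = round(posterior(b, x, mu, sigma) * (nb - 1))  # nearest grid point
                ev += px * V[j] * dx
            q_cont = c + ev
            q_stop = 1 - b    # respond Go now: error iff d = 0 (false alarm)
            if q_stop <= q_cont:
                thr = min(thr, b)
            newV.append(min(q_stop, q_cont))
        V, thresholds[t] = newV, thr
    return thresholds
```

Near the deadline the returned threshold approaches 0.5, since waiting no longer saves time and a Go response is then preferred whenever P{d = 1} exceeds P{d = 0}.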
The red lines (solid: 2AFC, dashed: GNG) illustrate the optimal decision thresholds, which, when exceeded by the cumulative sensory evidence bt, generate the corresponding response, as a function of time. For the 2AFC task, the optimal policy is a pair of thresholds that are initially fairly constant over time, but then collapse toward each other (into an empty set, if the cost of exceeding the deadline is sufficiently large) as the deadline approaches (cf. [29]). In contrast, the threshold for the GNG\n\nFigure 4: Drift-diffusion model (DDM) for 2AFC and GNG tasks. (A) A simplified version of the DDM for 2-choice tasks, where a noisy accumulation process with a certain rate produces one of two responses when it reaches a positive or negative threshold. In addition to the rate and threshold parameters, a third parameter (the temporal offset to the start of the accumulation process) represents the nondecision processes associated with visual and motor delays. (B) DDM fits to 2AFC and GNG choice data (Gomez et al., 2007; Mack & Palmeri, 2010) suggest that the GNG task is associated with a higher threshold and shorter offset than the 2AFC task. (C) The optimal decision-making model predicts a lower, time-varying threshold for the GNG task.\n\ntask (dashed line) is a single threshold that varies over time, and is lower at the beginning of the trial. 
This is a direct consequence of the opportunity cost involved in waiting until the deadline: if the deadline is far away, the cost of waiting may be more than the cost of an immediate error that terminates the trial; indeed, we expect that the farther away the deadline, the greater the temporal cost savings conferred by a Go response over waiting to register the NoGo response.\n\nDecision-making in 2AFC and GNG tasks\n\nFigure 2B;C shows the effect of the time-varying threshold on RT and accuracy in an example model simulation. Figure 2B shows that the GNG model is significantly biased towards the Go response, with a higher fraction of false alarms than misses. This asymmetry is absent in the 2AFC model performance. In addition, GNG response times are faster than 2AFC response times (Figure 2C). This bias is a direct result of the time-varying threshold in the GNG task: early in the trial, the decision threshold is lower, and produces fast, error-prone responses.\n\nThis model prediction is consistent with data from human perceptual decision-making. Figure 1 shows behavioral data in the two tasks [16] – subjects determined, from a brief presentation of a noisy visual stimulus, whether or not the image contained an animal. The same task was performed in two response conditions: 2AFC, where each stimulus required a yes/no response, and GNG, where subjects only responded to images containing the target. Figure 1A shows that in the 2AFC condition, subjects are not significantly biased towards either response, with the false alarm and miss rates being similar to each other. On the other hand, in the Go/NoGo condition, subjects showed a significant bias towards the overt response, producing substantially more false alarms and fewer misses. In the GNG task, their RT was also significantly shorter than in the 2AFC task (Figure 1B). 
Similar results have also been reported by Gomez et al. in the context of lexical decision-making [2].\n\nInfluence of stimulus probability on Go bias\n\nWe investigate the degree of Go bias in the GNG model by considering the effect of trial-type frequency on behavioral measures in the GNG task. Model simulations (Figure 3) show that, consistent with Figure 2 and a host of other experimental data, there is a significant bias toward the Go response when Go and NoGo trials are equiprobable, and this bias is increased (respectively, diminished) as NoGo trials become less (more) frequent. The figure also shows that RT for both correct Go and erroneous NoGo responses increases with the frequency of NoGo trials, and that false alarm RT is faster than correct-response RT. In recent work, Nieuwenhuis et al. [28] used a block design to compare choice accuracy and RT in a letter discrimination task when the fraction of NoGo trials was set to 20%, 50%, and 80%. As shown in Figure 3C;D, subjects' behavior was reliably modulated by trial-type frequency, in a manner closely reflecting model predictions.\n\nFigure 5: DDM approximation to the optimal decision-making model. Simplified DDMs were fit to optimal model simulations of 2AFC and GNG behavior, and the best-fit parameters compared between tasks. The DDM approximation for optimal GNG behavior shows a higher decision threshold (B), and lower nondecision time (C), than the DDM approximation for the 2AFC task. In addition, the rate of evidence accumulation was also lower for the GNG fit (A).\n\nIn our formulation, although the decision boundary is unchanged by the experimental manipulation, the stimulus frequency induces a prior belief over the identity of the stimulus, and thus determines the starting point of the evidence accumulation process. 
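The prior induced by the trial-type frequencies can be stated in log-odds units, the natural coordinate of SPRT/DDM-style accumulators (a one-line sketch; the parameterization is ours, not from the paper's fits):

```python
import math

def start_point(p_go):
    """Starting point of evidence accumulation in log-posterior-odds units,
    induced by the base rate p_go of Go trials."""
    return math.log(p_go / (1.0 - p_go))
```

With 80% Go trials the walk starts above the neutral point and reaches the Go boundary sooner; with 20% it starts below, qualitatively reproducing the modulation in Figure 3.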
When Go trials are rare, the starting point is far from the decision boundary, and it takes longer for a response to be generated. Further, due to the extra evidence needed to overcome the prior, choices are less likely to be erroneous.\n\nDrift-diffusion models and optimal behavior\n\nVarious versions of the augmented DDM have been used to fit GNG behavioral data, with one variant in particular suggesting that the decision threshold in GNG ought to be higher than in 2AFC [2], in apparent contradiction to our model's predictions (Figure 4). By fitting RT and choice data from lexical judgment, numerosity judgment, and memory-based decision-making tasks, Gomez et al. [2] found that a DDM with an implicit negative boundary associated with the NoGo stimulus provided a good fit to RT data. Further, joint parameter fits to 2AFC and GNG choice data indicated that the principal difference between the two tasks was in the nondecision time and decision threshold; the rate parameter (representing the evidence accumulation process) was similar in both tasks. In particular, they suggested that the nondecision time was shorter, and the decision threshold higher, in the GNG task than in the 2AFC task (Figure 4B). These results were replicated by Mack & Palmeri by fitting the DDM to behavioral data from a visual categorization task performed in both 2AFC and GNG versions [12].\n\nAlthough DDMs are formally equivalent to optimal decision-making in a restricted class of sequential choice problems [18], they do not explicitly represent and manipulate uncertainty and cost, as we do in our Bayesian risk-minimization framework. In particular, our framework allows us to predict that optimal behavior is well-characterized by a DDM with a time-varying threshold (Figure 4C), and that the restricted class of constant-threshold DDMs is insufficient to fully explain observed behavior. 
Nevertheless, we can ask whether our prediction is consistent with the empirical results obtained from DDM fits with constant decision thresholds. To address this, we computed the best constant-threshold DDM approximations to optimal decision making in the two tasks. We simulated the optimal model with a shared set of parameters for both the 2AFC and GNG tasks, and fit simplified random-walk models with 3 free parameters (Figure 4A) to the output of our optimal model's simulations. Figure 5 shows that the best-fitting DDM approximation for optimal GNG behavior has a higher threshold and a lower offset parameter than the best-fitting DDM for optimal 2AFC task behavior.\n\nNote that varying the magnitude of a symmetric (explicit and implicit) decision threshold is not capable of explaining the Go bias towards the overt response. Gomez et al. also considered additional variants of the DDM which allow for a change in the initial starting point, and for a different accumulation rate in the GNG task. These models, when fit to data, showed a bias towards the overt response; however, the quality of fit did not significantly improve [2].\n\nThus, our results and those of Gomez et al. [2] are conceptually consistent: a principal difference between the two tasks is the decision threshold, whereas the evidence accumulation process is similar across tasks. However, our analysis explains precisely how and why the thresholds in the two tasks are different: the GNG task has a time-varying threshold that is lower than the 2-choice threshold, due to the difference in loss functions between the two tasks. In particular, our model accounts for the bias towards the overt response, without recourse to an implicit decision boundary or additional parameter changes. 
When optimal behavior is approximated by a simpler class of models (e.g., models with a fixed decision threshold), the best fit to optimal GNG behavior turns out to have a higher threshold and shorter nondecision time, as found by previous work [2, 12], and adjustments to the initial starting point are then required to explain the overt-response bias.\n\n4 Discussion\n\nForcing a choice between two alternatives is a fundamental technique used to study a wide variety of perceptual and cognitive phenomena, but there has long been confusion over whether the GNG and 2AFC variants of such tasks probe the same underlying neural and cognitive processes. Our work demonstrates that a common Bayes-optimal sequential inference and decision policy can explain the behavioral results in both tasks, as well as what was perceived to be a troubling Go bias in the GNG task compared to 2AFC. We showed that the Go bias arises naturally as a rational response to the asymmetric time cost between Go and NoGo responses, as the former immediately terminates the trial, while the latter requires the subject to wait until the end of the trial to record the choice. The consequence of this cost asymmetry is an optimal decision policy that requires Bayesian evidence accumulation up to a time-varying boundary with an inverted-U shape: the initial low boundary is due to the temporal advantage of choosing to Go early and saving the time necessary to wait to register a NoGo response, while the later collapse of the boundary is due to the expectation of the response deadline. We showed that this optimal decision policy accounts for the general behavioral phenomena observed in GNG tasks, in particular the Go bias. 
Importantly, our work shows that there need not be any fundamental differences in the cognitive and neural processes underlying perception and decision-making in these tasks, at least not on account of the Go bias.

Our model makes several novel experimental predictions for the GNG task: (1) for fast responses, the false alarm rate increases as a function of response time (in contrast, the fixed-threshold DDM approximation predicts a constant false alarm rate); (2) lengthening the response deadline should exacerbate the Go bias; (3) if GNG and 2AFC share a common inference and decision-making neural infrastructure, then our model predicts within-subject cross-task correlations: e.g., favoring speed over accuracy in the 2AFC task should correlate with a greater Go bias in the GNG task.

The optimal decision policy for the GNG task can naturally be viewed as a stochastic process (though it is normatively derived from task statistics and behavioral goals). We can therefore compare our model to other stochastic process models previously proposed for the GNG task. Our model has a single decision threshold associated with the overt response, consistent with some early models proposed for the task (see, e.g., Sperling and Dosher [30]). In contrast, the extended DDM framework proposed by Gomez et al. has an additional boundary associated with the NoGo response (corresponding to a covert NoGo response). Gomez et al. report that single-threshold variants of the DDM provided very poor fits to the data. Although computationally and behaviorally we do not require a covert response or associated threshold, it is nevertheless possible that neural implementations of behavior in the task involve an explicit "NoGo" choice. For instance, substantial empirical work aims to isolate neural correlates of restraint, corresponding to a putative "NoGo" action, by contrasting neural activity on "go" and "nogo" trials (see, e.g., [31, 32]).
We will consider approximating the optimal policy with one that includes this second boundary in future work.

References

[1] R. Ratcliff and P. L. Smith. Psychol. Rev., 111:333–346, 2004.
[2] P. Gomez, R. Ratcliff, and M. Perea. Journal of Experimental Psychology, 136(3):389–413, 2007.
[3] M. Usher and J. L. McClelland. Psychol. Rev., 108(3):550–592, 2001.
[4] R. Bogacz, E. Brown, J. Moehlis, P. Holmes, and J. D. Cohen. Psychological Review, 113(4):700, 2006.
[5] R. D. Luce. Number 8. Oxford University Press, USA, 1991.
[6] F. C. Donders. Acta Psychologica, 30:412, 1969.
[7] B. Gordon and A. Caramazza. Brain and Language, 15(1):143–160, 1982.
[8] Y. Hino and S. J. Lupker. Journal of Experimental Psychology: Human Perception and Performance, 26:166–183, 2000.
[9] M. Perea, E. Rosa, and C. Gomez. Memory and Cognition, 30:34–45, 2002.
[10] S. Thorpe, D. Fize, and C. Marlot. Nature, 381(6582):520–522, 1996.
[11] A. Delorme, G. Richard, and M. Fabre-Thorpe. Vision Research, 40(16):2187–2200, 2000.
[12] M. L. Mack and T. J. Palmeri. Journal of Vision, 10:1–11, 2010.
[13] M. A. Sommer and R. H. Wurtz. J. Neurophysiol., 85(4):1673–1685, 2001.
[14] R. P. Hasegawa, B. W. Peterson, and M. E. Goldberg. Neuron, 43(3):415–425, August 2004.
[15] G. Aston-Jones, J. Rajkowski, and P. Kubiak. J. Neurosci., 14:4467–4480, 1994.
[16] N. Bacon-Macé, H. Kirchner, M. Fabre-Thorpe, and S. J. Thorpe. J. Exp. Psychol.: Human Perception and Performance, 33(5):1013, 2007.
[17] A. Wald. Dover Publications, 1947.
[18] A. Wald and J. Wolfowitz. The Annals of Mathematical Statistics, 19(3):326–339, 1948.
[19] J. D. Roitman and M. N. Shadlen. J. Neurosci., 22(21):9475, 2002.
[20] J. I. Gold and M. N. Shadlen. Neuron, 36(2):299–308, 2002.
[21] M. Stone. Psychometrika, 25(3):251–260, 1960.
[22] D. R. J. Laming. Academic Press, 1968.
[23] R. Ratcliff. Psychological Review, 85(2):59, 1978.
[24] J. I. Gold and M. N.
Shadlen. Annu. Rev. Neurosci., 30:535–574, 2007.
[25] D. P. Hanes and J. D. Schall. Science, 274(5286):427, 1996.
[26] M. E. Mazurek, J. D. Roitman, J. Ditterich, and M. N. Shadlen. Cerebral Cortex, 13(11):1257, 2003.
[27] R. Ratcliff, A. Cherian, and M. Segraves. Journal of Neurophysiology, 90:1392–1407, 2003.
[28] S. Nieuwenhuis, N. Yeung, W. van den Wildenberg, and K. R. Ridderinkhof. Cognitive, Affective & Behavioral Neuroscience, 3(1):17–26, March 2003.
[29] P. Frazier and A. J. Yu. Advances in Neural Information Processing Systems, 20:465–472, 2008.
[30] G. Sperling and B. Dosher. Handbook of Perception and Human Performance, 1:2-1, 1986.
[31] D. J. Simmonds, J. J. Pekar, and S. H. Mostofsky. Neuropsychologia, 46(1):224–232, 2008.
[32] A. R. Aron, S. Durston, D. M. Eagle, G. D. Logan, C. M. Stinear, and V. Stuphorn. The Journal of Neuroscience, 27(44):11860–11864, 2007.