{"title": "Temporal Dynamics of Cognitive Control", "book": "Advances in Neural Information Processing Systems", "page_first": 1353, "page_last": 1360, "abstract": "Cognitive control refers to the flexible deployment of memory and attention in response to task demands and current goals. Control is often studied experimentally by presenting sequences of stimuli, some demanding a response, and others modulating the stimulus-response mapping. In these tasks, participants must maintain information about the current stimulus-response mapping in working memory. Prominent theories of cognitive control use recurrent neural nets to implement working memory, and optimize memory utilization via reinforcement learning. We present a novel perspective on cognitive control in which working memory representations are intrinsically probabilistic, and control operations that maintain and update working memory are dynamically determined via probabilistic inference. We show that our model provides a parsimonious account of behavioral and neuroimaging data, and suggest that it offers an elegant conceptualization of control in which behavior can be cast as optimal, subject to limitations on learning and the rate of information processing. Moreover, our model provides insight into how task instructions can be directly translated into appropriate behavior and then efficiently refined with subsequent task experience.", "full_text": "Temporal Dynamics of Cognitive Control\n\nJeremy R. Reynolds\n\nDepartment of Psychology\n\nUniversity of Denver\nDenver, CO 80208\n\njeremy.reynolds@psy.du.edu\n\nMichael C. Mozer\n\nDepartment of Computer Science and\n\nInstitute of Cognitive Science\n\nUniversity of Colorado\n\nBoulder, CO 80309\n\nmozer@colorado.edu\n\nAbstract\n\nCognitive control refers to the \ufb02exible deployment of memory and attention in re-\nsponse to task demands and current goals. 
Control is often studied experimentally\nby presenting sequences of stimuli, some demanding a response, and others mod-\nulating the stimulus-response mapping. In these tasks, participants must maintain\ninformation about the current stimulus-response mapping in working memory.\nProminent theories of cognitive control use recurrent neural nets to implement\nworking memory, and optimize memory utilization via reinforcement learning.\nWe present a novel perspective on cognitive control in which working memory\nrepresentations are intrinsically probabilistic, and control operations that maintain\nand update working memory are dynamically determined via probabilistic infer-\nence. We show that our model provides a parsimonious account of behavioral and\nneuroimaging data, and suggest that it offers an elegant conceptualization of con-\ntrol in which behavior can be cast as optimal, subject to limitations on learning\nand the rate of information processing. Moreover, our model provides insight into\nhow task instructions can be directly translated into appropriate behavior and then\nef\ufb01ciently re\ufb01ned with subsequent task experience.\n\n1 Introduction\n\nCognitive control can be characterized as the ability to guide behavior according to current goals\nand plans. Control often involves overriding default or overlearned behaviors. Classic examples of\nexperimental tasks requiring this ability include Stroop, Wisconsin card sorting, and task switch-\ning (for a review, see [1]). Although these paradigms vary in super\ufb01cial features, they share the\nkey underlying property that successful performance involves updating and maintaining a task set.\nThe task set holds the information required for successful performance, e.g., the stimulus-response\nmapping, or the dimension along which stimuli are to be classi\ufb01ed or reported. 
For example, in\nWisconsin card sorting, participants are asked to classify cards with varying numbers of instances of\na colored symbol. The classi\ufb01cation might be based on color, symbol, or numerosity; instructions\nrequire participants to identify the current dimension through trial and error, and perform the\nappropriate classi\ufb01cation until the dimension switches after some unspeci\ufb01ed number of trials. Thus,\nthe task requires participants to maintain a task set\u2014the classi\ufb01cation dimension\u2014in working memory\n(WM). Likewise, in the Stroop task, stimuli are color names presented in various ink colors, and the\ntask set speci\ufb01es whether the color is to be named or the word is to be read.\nTo understand cognitive control, we need to characterize the brain\u2019s policy for updating, maintaining,\nand utilizing task set. Moreover, we need to develop theories of how task instructions are translated\ninto a policy, and how this policy is re\ufb01ned with subsequent experience performing a task.\n\n1.1 Current Computational Theories of Control\n\nFrom a purely computational perspective, control is not a great challenge. Every computer program\nmodulates its execution based on internal state variables. The earliest psychological theories of control\nhad this \ufb02avor: Higher cognitive function was conceived of as a logical symbol system whose\nvariables could be arbitrarily bound [2], allowing for instructions to be used appropriately\u2014and\nperfectly\u2014to update representations that support task performance. For example, in the Wisconsin\ncard sorting task, the control instruction\u2014the classi\ufb01cation dimension\u2014would be bound to a variable,\nand responses would be produced by rules of the form, \u201cIf the current dimension is D and\nthe stimulus is X, respond Y\u201d. Behavioral data indicate that this naive computational perspective is\nunlikely to be how control is implemented in the brain. 
Consider the following phenomena:\n\n\u2022 When participants are asked to switch tasks, performance on the \ufb01rst trial following a\nswitch is inef\ufb01cient, although performance on subsequent trials is ef\ufb01cient, suggesting that\nloading a new task set depends on actually performing the new task [3]. This \ufb01nding\nis observed even for very simple tasks, and even when the switches are regular, highly\npredictable, and well practiced.\n\n\u2022 Switch costs are asymmetric, such that switching from an easy task to a dif\ufb01cult task is\neasier than vice-versa [4].\n\n\u2022 Some task sets are more dif\ufb01cult to implement than others. For example, in the Stroop task,\nreading the word is quick and accurate, but naming the ink color is not [5].\n\n\u2022 The dif\ufb01culty of a particular task depends not only on the characteristics of the task itself,\nbut also on the context in which participants might be called upon to perform [6].\n\nTo account for phenomena such as these, theories of control have in recent years focused on how\ncontrol can be implemented in cortical neural networks.\nIn the prevailing neural-network-based\ntheory, task set is represented in an activity-based memory system, i.e., a population of neurons\nwhose recurrent activity maintains the representation over time. This active memory, posited to\nreside in prefrontal cortex (PFC), serves to bias ongoing processing in posterior cortical regions to\nachieve \ufb02exibility and arbitrary, task-dependent stimulus-response mappings (for review, see [1]).\nFor example, in the Stroop task, instructions to report the ink color might bias the neural population\nrepresenting colors\u2014i.e., increase its baseline activity prior to stimulus onset\u2014such that when\nstimulus information arrives, it will reach threshold more rapidly, and will beat out the neural\npopulation that represents word orthography in triggering response systems [7]. 
In this framework, a\ncontrol policy must specify the updating and maintenance of task set, which involves deciding when to gate new\nrepresentations into WM and setting the strength of the recurrent connection that maintains the memory.\nFurther, the policy must specify which WM populations bias which posterior representations, and\nthe degree to which biasing is required. Some modelers have simply speci\ufb01ed the policy by hand\n[8], whereas most pretrain the model to perform a task\u2014in a manner meant to re\ufb02ect long-term\nlearning prior to experimental testing [7, 9, 10].\nThese models provide an account for a range of neurophysiological and behavioral data. However,\nthey might be criticized on a number of grounds. First, like their symbolic predecessors, the neural\nnetwork models must often be crippled arbitrarily to explain data; for example, by limiting the\nstrength of recurrent memory connections, the models obtain task set decay and can explain error\ndata. Second, the models require a stage of training which is far more akin to how a monkey\nlearns to perform a task than to how people follow task instructions. The reinforcement-learning\nbased models require a long stage of trial-and-error learning before the appropriate control policy\nemerges. Whereas monkeys are often trained for months prior to testing, a notable characteristic of\nhumans is that they can perform a task adequately on the \ufb01rst trial from task instructions [11].\n\n2 Control as Inference\n\nOur work aims to provide an alternative, principled conceptualization of cognitive control. Our goal\nis to develop an elegant theoretical framework with few free parameters that can easily be applied\nto a wide range of experimental tasks. With strong computational and algorithmic constraints, our\nframework has few degrees of freedom, and consequently, makes strong, experimentally veri\ufb01able\npredictions. 
Additionally, as a more abstract framework than the neural net theories, one aim is to\nprovide insight as to how task instructions can be used directly and immediately to control behavior.\nA fundamental departure of our approach from previous approaches is to consider WM as inherently\nprobabilistic. That is, instead of proposing that task set is stored in an all-or-none fashion, we wish\nto allow for task set\u2014as well as all cortical representations\u2014to be treated as random variables. This\nnotion is motivated by computational neuroscience models showing how population codes can be\nused to compute under uncertainty [12].\nGiven inherently probabilistic representations, it is natural to treat the problems of task set updating,\nmaintenance and utilization as probabilistic inference. To provide an intuition about our approach,\nconsider this scenario. I will walk around my house and tell you what objects I see. Your job is\nto guess what I\u2019ll report next. Suppose I report the following sequence: REFRIGERATOR, STOVE,\nSINK, TOILET, SHOWER, DRESSER. To guess what I\u2019m likely to see next, you need to infer what\nroom I am in. Even though the room is a latent variable, it can be inferred from the sequence of\nobservations. At some points in the sequence, the room can be determined with great con\ufb01dence\n(e.g., after seeing TOILET and SHOWER). At other times, the room is ambiguous (e.g., following\nSINK), and only weak inferences can be drawn.\nBy analogy, our approach to cognitive control treats task set as a latent variable that must be inferred\nfrom observations. 
The observations consist of stimulus-response-feedback triples. Sometimes the\nobservations will strongly constrain the task set, as in the Stroop task when the word GREEN is\nshown in color red, and the correct response is red, or when an explicit instruction is given to report\nthe ink color; but other times the observations provide little constraint, as when the word RED is\nshown in color red, and the correct response is red. One inference problem is therefore to determine\ntask set from the stimulus-response sequence. A second, distinct inference problem is to determine\nthe correct response on the current trial from the current stimulus and the trial history. Thus, in our\napproach, control and response selection are cast as inference under uncertainty.\nIn this paper, we \ufb02esh out a model based on this approach. We use the model to account for\nbehavioral data from two experiments. Each experiment involves a complex task environment in which\nexperimental participants are required to switch among eight tasks that have different degrees of\noverlap and inconsistency with one another. Having constrained the model by \ufb01tting behavioral\ndata, we then show that the model can explain neuroimaging data. Moreover, the model provides\na different interpretation of these data than has been suggested previously. Beyond accounting for\ndata, the model provides an elegant theoretical framework in which control and response selection\ncan be cast as optimal, subject to limitations on the processing architecture.\n\n3 Methods\n\nOur model addresses data from two experiments conducted by Koechlin, Ody, and Kouneiher [6]. In\neach experiment, participants are shown blocks of 12 trials, preceded by a cue that indicates which\nof the eight tasks is to be performed with the stimuli in that block. The task speci\ufb01es a\nstimulus-response mapping. The stimuli in Experiments 1 and 2 are colored squares and colored letters,\nrespectively. 
Examples of the sequence of cues and stimuli for the two experiments are shown in\nFigure 1A. In both experiments, there are two potential responses.\nThe stimulus-response mappings for Experiment 1 are shown in the eight numbered boxes of\nFigure 1C. (The layout of the boxes will be explained shortly.) Consider task 3 in the upper left corner\nof the \ufb01gure. The notation indicates that task 3 requires a left response to a green square, a right\nresponse to a red square, and no response (hereafter, no-go) to a white square. Task 4 is identical to\ntask 3, and the duplication is included because the tasks are described as distinct to participants and\neach is associated with a unique task cue. The duplication makes the stimulus-response mapping\ntwice as likely, because the eight tasks have uniform priors. Task 1 (lower left corner of the \ufb01gure)\nrequires a left response for a green square and no-go for a white square. There are no red stimuli in\nthe task 1 blocks, and the green\u2192left mapping is depicted twice to indicate that the probability of a\ngreen square appearing in the block is twice that of a white square.\nWe now explain the 3 \u00d7 2 arrangement of cells in Figure 1C. First the rows. The four tasks in\nthe lower row allow for only one possible response (not counting no-go as a response), whereas\nthe four tasks in the upper row demand that a choice be made between two possible responses.\n\nFigure 1: (A) Examples of stimulus sequences from Exp. 1 and 2 (top and bottom arrows,\nrespectively) of [6]. (B) Eight tasks in Exp. 2, adapted from [6]. (C) Eight tasks in Exp. 1. (D) Response\ntimes from participants in Exp. 1 and 2 (white and black points, respectively). The data points\ncorrespond to the \ufb01lled grey cells of (B) and (C), and appear in homologous locations. X-axis of graph\ncorresponds to columns of the 3\u00d72 array of cells in (B) and (C); squares and circles correspond to\ntop and bottom row of each 3\u00d72 array. 
(E) Simulation results from the model.\n\nThus, the two rows differ in terms of the demands placed on response selection. The three columns\ndiffer in the importance of the task identity. In the leftmost column, task identity does not matter,\nbecause each mapping (e.g., green\u2192left) is consistent irrespective of the task identity. In contrast,\ntasks utilizing yellow, blue, and cyan stimuli involve varied mappings. For example, yellow maps\nto left in two tasks, to right in one task, and to no-go in one task. The tasks in the middle column\nare somewhat less dependent on task identity, because the stimulus-response mappings called for\nhave the highest prior. Thus, the three columns represent a continuum along which the importance\nof task identity varies, from being completely irrelevant (left column) to being critical for correct\nperformance (right column). Empty cells within the grid are conceptually possible, but were omitted\nfrom the experiment.\nExperiment 2 has the same structure as Experiment 1 (Figure 1B), with an extra level of complexity.\nRather than mapping a color to a response, the color determines which property of the stimulus is to\nbe used to select a response. For example, task 3 of Figure 1B demands that a green letter stimulus\n(denoted as X here) be classi\ufb01ed as a vowel or consonant (property P1), whereas a red letter stimulus\nbe classi\ufb01ed as upper or lower case (property P2). 
Thus, Experiment 2 places additional demands\nof stimulus classi\ufb01cation and selection of the appropriate stimulus dimension.\nParticipants in each experiment received extensive practice on the eight tasks before being tested.\nTesting involved presenting each task following each other task, for a total of 64 test blocks.\n\n3.1 A Probabilistic Generative Model of Control Tasks\n\nFollowing the style of many probabilistic models in cognitive science, we have designed a generative\nmodel of the domain, and then invert the model to perform recognition via Bayesian inference. In\nour case, the generative model is of the control task, i.e., the model produces sequences of stimulus-\nresponse pairs such that the actual trial sequence would be generated with high probability. Instead\nof learning this model from data, though, we assume that task instructions are \u2019programmed\u2019 into\nthe model.\nOur generative model of control tasks is sketched in Figure 2A as a dynamical Bayes net. Vertical\nslices of the model represent the trial sequence, with the subscript denoting the trial index. First\nwe explain the nodes and dependencies and then describe the conditional probability distributions\n(CPDs).\nThe B node represents the task associated with the current block of trials. (We use the term \u2019block\u2019\nas shorthand notation for this task.) 
The block on trial k has 8 possible values in the experiments we\nmodel, and its value depends on the block on trial k \u2212 1. The block determines the category of the\nstimulus, C, which in turn determines the stimulus identity, S. The categories relevant to the present\nexperiments are: color label, block cue (the cue that identi\ufb01es the task in the next block), upper/lower\ncase for letters, and consonant/vowel for letters. The stimuli correspond to instantiations of these\ncategories, e.g., the letter Q which is an instance of an upper case consonant. Finally, the R node\ndenotes the response, which depends both on the current stimulus category and the current block.\n\n[Figure 1 legend. X: {A,E,I,O,a,e,i,o,C,G,K,P,c,g,k,p}; P1: vowel/consonant, P2: upper/lower case discrimination tasks; L: left response; R: right response.]\n\nFigure 2: Dynamical Bayes net depiction of our generative model of control tasks, showing the\ntrial-to-trial structure of the model. [Nodes shown: B, C, S, R, and T for trials k\u22121, k, and k+1.]\n\nThis description of the model is approximate for two reasons. First, we decompose the category and\nstimulus representations into shape and color dimensions, expanding C into C^color and C^shape, and\nS into S^color and S^shape. (When we refer to C or S without the superscript, it will denote both the\nshape and color components.) Second, we wish to model the temporal dynamics of a single trial,\nin order to explain response latencies. Although one could model the temporal dynamics as part of\nthe dynamical Bayes net architecture, we adopted a simpler and nearly equivalent approach, which\nis to explicitly represent time, T, within a trial, and to assume that in the generative model, stimulus\ninformation accumulates exponentially over time. 
With normalization of probabilities, this formulation\nis identical to a naive Bayes model with conditionally independent stimulus observations at\neach time step. With these two modi\ufb01cations, the slices of the network (indicated by the dashed\nrectangle in Figure 2A) are as depicted in Figure 2B.\nTo this point, we\u2019ve designed a generic model of any experimental paradigm involving\ncontext-dependent stimulus-response mappings. The context is provided by the block B, which is essentially\na memory that can be sustained over trials. To characterize a speci\ufb01c experiment, we must specify\nthe CPDs in the architecture. These distributions can be entirely determined by the experiment\ndescription (embodied in Figure 1B,C). We toss in one twist to the model, which is to incorporate four\nparameters into the CPDs that permit us to specify aspects of the human cognitive architecture, as\nfollows: \u03b5, the degree of task knowledge (0: no knowledge; 1: perfect knowledge); \u03bb, the persistence\nof the block memory (0: memory decays completely from one trial to the next; 1: memory\nis perfect); and \u03b3_shape and \u03b3_color, the rate of transmission of shape and color information between\nstimulus and category representations. Given these parameters and the experiment description, we\ncan de\ufb01ne the CPDs in the model:\n\u2022 P(B_k = b' | B_{k\u22121} = b) = \u03b4_{b',b} \u03bb + (1 \u2212 \u03bb)/N_B, where \u03b4_{.,.} is the Kronecker delta and N_B is the\nnumber of distinct block (task) identities. 
This distribution is a mixture of a uniform distribution\n(no memory of block) and an identity mapping (perfect memory).\n\u2022 P(C^z_k | B_k) = \u03b5 P\u2217(C^z_k | B_k) + (1 \u2212 \u03b5)/N_{C^z}, where z \u2208 {color, shape} and N_{C^z} is the number\nof distinct category values along dimension z, and P\u2217(.|.) is the probability distribution de\ufb01ned\nby the experiment and task (see Figure 1B,C). The mixture parameter, \u03b5, interpolates between\na uniform distribution (no knowledge of task) and a distribution that represents complete task\nknowledge.\n\u2022 P(R_k | B_k, C_k) = \u03b5 P\u2217(R_k | B_k, C_k) + (1 \u2212 \u03b5)/N_R, where N_R is the number of response alternatives\n(including no-go).\n\u2022 P(S^z_k = s | C^z_k = c, T = t) \u221d (1 + \u03b3_z M^z(s, c))^t, where z \u2208 {color, shape} and M^z(s, c)\nis a membership function that has value 1 if s is an instance of category c along dimension z,\nor 0 otherwise. By this CPD, the normalized probability for stimulus s grows exponentially to\nasymptote as a function of time t if s belongs to category c, and drops exponentially toward zero\nif s does not belong to c.\n\nThis formulation encodes the experiment description\u2014as represented by the P\u2217(.|.) probabilities\u2014in\nthe model\u2019s CPDs, with smoothing via \u03b5 to represent less-than-perfect knowledge of the experiment\ndescription.\n\nFigure 3: (top row) human neuroimaging data from three brain regions [6], (bottom row) entropy\nread out from three nodes of the model. Full explanation in the text.\n\nWe would like to read out from the model a response on some trial k, given the stimulus on trial k,\nS_k, and a history of past stimulus-response pairs, H_k = {S_1 ... S_{k\u22121}, R_1 ... R_{k\u22121}}. 
(In the experiments,\nsubjects are well practiced and make few errors. Therefore, we assume the R\u2019s are correct\nor corrected responses.) The response we wish to read out consists of a choice and the number of\ntime steps required to make the choice. To simulate processing time within a trial, we search over\nT. Larger values of T correspond to more time for evidence to propagate in the model, which leads to lower\nentropy distributions over the hidden variables C_k and R_k. The model initiates a response when\none value of R_k passes a threshold \u03b8, i.e., when [max_r P(R_k = r | S_k, T, H_k)] > \u03b8. This yields the\nresponse time (RT)\n\nt\u2217 = min { t | [max_r P(R_k = r | S_k, T = t, H_k)] > \u03b8 }    (1)\n\nand the response r\u2217 = argmax_r P(R_k = r | S_k, T = t\u2217, H_k).\n\n4 Simulation Results\n\nWe simulated the model on a trial sequence like that in the human study. We obtained mean RTs and\nerror rates from the model in the four experimental conditions of the two experiments (see the \ufb01lled\ncells of Figure 1B,C). The model\u2019s \ufb01ve parameters\u2014\u03b5, \u03bb, \u03b3_shape, \u03b3_color, and \u03b8\u2014were optimized to\nobtain the maximum correlation between the mean RTs obtained from the simulation (Equation 1)\nand the human data (Figure 1D). This optimization resulted in a correlation between human and\nsimulation RTs of 0.99 (compare Figure 1D and E), produced by parameter values \u03b5 = 0.87, \u03bb =\n0.79, \u03b3_shape = 0.34, \u03b3_color = 0.88, and \u03b8 = 0.63.\nTo express simulation time in units of milliseconds\u2014the measure of time collected in the human\ndata\u2014we allowed an af\ufb01ne transform, which includes two free parameters: an offset constant\nindicating the time required for early perceptual and late motor processes, which are not embodied\nin the model, and a scale constant to convert units of simulation time to milliseconds. 
With these\ntwo transformation parameters, the model had a total of seven parameters. The astute reader will\nnote that there are only eight data points to \ufb01t, and one should therefore not be impressed by a close\nmatch between simulation and data. However, our goal is to constrain model parameters with this\n\ufb01t, and then explore emergent properties of the resulting fully constrained model.\n\n[Figure 3 panels. Top row (human MR signal): premotor cortex, posterior lateral PFC, anterior lateral PFC. Bottom row (model entropy): R node, C^shape node, B node. Conditions: Exp. 1 and Exp. 2; Single and Dual; x-axis of the right column: importance of task identity.]\n\nOne indication of model robustness is how well the model generalizes to sequences of trials other\nthan the one on which it was optimized. Across 11 additional generalization runs, the correlation\nbetween model and empirical data remained high with low variability (\u03c1\u0304 = 0.97, \u03c3_\u03c1 = 0.004).\nA second test of robustness is how sensitive the model is to\nthe choice of parameters. If randomly selected parameters yield large correlations, then the model\narchitecture itself is responsible for the good \ufb01t, not the particular choice of parameters. To perform\nthis test, we excluded parameter ranges in which the model failed to respond reliably (i.e.,\nthe model never attained the response criterion of Equation 1), or in which the model produced no\nRT variation across conditions. These requirements led to parameter ranges of: 0.8 \u2264 \u03b5 \u2264 0.98;\n0.1 \u2264 \u03b3_color, \u03b3_shape \u2264 1.5; 0.6 \u2264 \u03bb \u2264 0.98; 0.65 \u2264 \u03b8 \u2264 0.85. 
All randomly selected combina-\ntions of parameters in these ranges led to correlation values greater than 0.9, demonstrating that the\nqualitative \ufb01t between model and behavioral results was insensitive to parameter selection, and that\nthe structure of the model is largely responsible for the \ufb01t obtained.\nKoechlin, Ody, and Kouneiher [6] collected not only behavioral data, but also neuroimaging data\nthat identi\ufb01ed brain regions involved in control, and how these brain regions modulated their activa-\ntion across experimental manipulations. There were three manipulations in the experiments: (1) the\ndemand on response selection (varied along rows of Figure 1C), (2) the importance of task identity\n(varied along the three columns of both Figure 1B and 1C), and (3) the demand of stimulus clas-\nsi\ufb01cation and selection of stimulus dimensions (varied along rows of Figure 1B). The top row of\nFigure 3 shows effects of these experimental manipulations on the fMRI BOLD response of three\ndifferent brain regions.\nThe remarkable result obtained in our simulations is that we identi\ufb01ed three components of the\nmodel that produced signatures analogous to those of the fMRI BOLD response in three cortical\nareas. We hypothesized that neural (fMRI) activity in the brain might be related to the entropy of\nnodes in the model, on account of the fact that when entropy is high, many possibilities must be\nsimultaneously represented, which may lead to greater BOLD signal. Because fMRI techniques\nintroduce signi\ufb01cant blurring in time, any measure in the model corresponding to the fMRI signal\nwould need to be integrated over the time of a trial. We therefore computed the mean entropy of\neach model node over time T = 1...t\u2217 within a trial. We then averaged the entropy measure across\ntrials within a condition, precisely as we did the RTs. 
To compare these entropy measures to the\nimaging data, the value corresponding to the bottom left cell of each experiment array (see Figure\n1B and 1C) was subtracted from all of the conditions of that particular experiment. This subtraction\nwas performed because the nature of the MRI signal is relative, and these two cells form the baseline\nconditions within the empirical observations. After performing this normalization, the values for R\nand C shape were then collapsed across the columns in panels B and C of Figure 1, resulting in a\nbar for each row within each panel. Additionally, the values for B were then collapsed across the\nrows of each panel, resulting in a value for each column. The model entropy results are shown in\nthe bottom row of Figure 3, and comparison with the top row reveals an exact correspondence. We\nemphasize that these results are obtained with the model which was fully constrained by \ufb01tting the\nRT data. Thus, these results are emergent properties of the model.\nBased on functional neuroanatomy, the correspondence between model components and brain re-\ngions is quite natural. Starting with the left column of Figure 3, uncertainty in the model\u2019s response\ncorresponds to activity in premotor cortex. This activity is greater when the block calls for two dis-\ntinct responses than when it calls for one. In the middle column of Figure 3, the uncertainty of shape\ncategorization corresponds to activity in posterior lateral prefrontal cortex. This region is thought\nto be involved in the selection of task-relevant information, which is consistent with the nature of\nthe current conditions that produce increases. In the right column of Figure 3, the uncertainty of the\ntask identity (block) in the model corresponds to activity in anterior lateral PFC, a brain region near\nareas known to be involved in WM maintenance. Interestingly, the lower the entropy the higher the\nneural activity, in contrast to the other two regions. 
There is a natural explanation for this inversion,\nthough: entropy is high in the block node when the block representation matters the least, i.e.,\nwhen the stimulus-response mapping does not depend on knowing the task identity. Thus, higher\nentropy of the block node actually connotes less information to be maintained due to the functional\nequivalence among classes.\n\n5 Discussion\n\nWe proposed a theoretical framework for understanding cognitive control which provides a\nparsimonious account of behavioral and neuroimaging data from two large experiments. These experiments\nare suf\ufb01ciently broad that they subsume several other experimental paradigms (e.g., Stroop, task\nswitching). Koechlin et al. [6] explain their \ufb01ndings in terms of a descriptive model that involves a\ncomplex hierarchy of control processes within prefrontal cortex. The explanation for the neuroimaging\ndata that emerges from our model is arguably simpler and more intuitive.\n\nFigure 4: Task (block) representation over a sequence of trials that involves all eight task types.\n\nThe key insight that underlies our model is the notion that cortical representations are intrinsically\nprobabilistic. This notion is not too surprising to theorists in computational neuroscience, but it leads\nto a perspective that is novel within the \ufb01eld of control: that the all-or-none updating of WM can be\nreplaced with a probabilistic notion of updating, and the view that WM holds competing hypotheses\nin parallel. Framing WM in probabilistic terms also offers a principled explanation for why WM\nshould decay. The parameter \u03bb controls a tradeoff between the ability to hold information over time\nand the ability to update when new relevant information arrives. 
In contrast, many neural network\nmodels have two distinct parameters that control these aspects of memory.\nAnother novelty of our approach is the notion that control results from dynamical inference processes,\ninstead of being conceived of as resulting from long-term policy learning. Inference plays\na critical role in the WM (task identity) representation: WM is maintained not solely by internal\nprocesses (e.g., the recurrent connections in a neural net), but is continually influenced by the ongoing\nstream of stimuli via inference. The stimulus stream sometimes supports the WM representation\nand sometimes disrupts it. Figure 4 shows the trial-to-trial dynamics of the WM in our model. Note\nthat depending on the task, the memory looks quite different. When the stimulus-response pairs\nare ambiguous as to the task, the representation becomes less certain. Fortunately for the model\u2019s\nperformance, this is exactly the circumstance in which remembering the task identity is least critical.\nFigure 4 also points to a promising future direction for the model. The stream of trials clearly\nshows strong sequential effects. We are currently pursuing opportunities to examine the model\u2019s\npredictions regarding performance on the first trial in a block versus subsequent trials. The model\nshows an effect observed in the task switching literature: initial trial performance is poor, but control\nrapidly tunes to the task and subsequent trials are more efficient and roughly comparable.\nOur model seems to have surprisingly strong predictive power. This power comes about from the\nfact that the model expresses a form of bounded rationality: the model encodes the structure of the\ntask, subject to limitations on memory, learning, and the rate of perceptual processing. 
Exploiting\nthis bounded rationality leads to strong constraints, few free parameters, and the ability to extend the\nmodel to new tasks without introducing additional free parameters.\n\nReferences\n\n[1] E. K. Miller and J. D. Cohen. An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24:167\u2013202, 2001.\n\n[2] A. Newell and H. A. Simon. Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ, 1972.\n\n[3] R. D. Rogers and S. Monsell. Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124:207\u2013231, 1995.\n\n[4] N. Yeung and S. Monsell. Switching between tasks of unequal familiarity: the role of stimulus-attribute and response-set selection. Journal of Experimental Psychology: Human Perception and Performance, 29(2):455\u2013469, 2003.\n\n[5] C. M. MacLeod. Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109:163\u2013203, 1991.\n\n[6] E. Koechlin, C. Ody, and F. Kouneiher. Neuroscience: The architecture of cognitive control in the human prefrontal cortex. Science, 424:1181\u20131184, 2003.\n\n[7] J. D. Cohen, K. Dunbar, and J. L. McClelland. On the control of automatic processes: A parallel distributed processing model of the Stroop effect. Psychological Review, 97(3):332\u2013361, 1990.\n\n[8] S. J. Gilbert and T. Shallice. Task switching: A PDP model. Cognitive Psychology, 44:297\u2013337, 2002.\n\n[9] N. P. Rougier, D. Noelle, T. S. Braver, J. D. Cohen, and R. C. O\u2019Reilly. Prefrontal cortex and the flexibility of cognitive control: Rules without symbols. Proceedings of the National Academy of Sciences, 102(20):7338\u20137343, 2005.\n\n[10] M. J. Frank and R. C. O\u2019Reilly. A mechanistic account of striatal dopamine function in human cognition: Psychopharmacological studies with cabergoline and haloperidol. Behavioral Neuroscience, 120:497\u2013517, 2006.\n\n[11] S. Monsell. Control of mental processes. 
In V. Bruce, editor, Unsolved mysteries of the mind: Tutorial essays in cognition, pages 93\u2013148. Psychology Press, Hove, UK, 1996.\n\n[12] R. S. Zemel, P. Dayan, and A. Pouget. Probabilistic interpretation of population codes. Neural Computation, 10(2):403\u2013430, 1998.\n\n[Figure 4 plot: x-axis Trial Number (0 to 100), y-axis p(Bk) (0 to 1), legend entries 1 to 8]", "award": [], "sourceid": 254, "authors": [{"given_name": "Jeremy", "family_name": "Reynolds", "institution": null}, {"given_name": "Michael", "family_name": "Mozer", "institution": null}]}