{"title": "Principles of real-time computing with feedback applied to cortical microcircuit models", "book": "Advances in Neural Information Processing Systems", "page_first": 835, "page_last": 842, "abstract": "", "full_text": "Principles of real-time computing with feedback applied to cortical microcircuit models\n\nWolfgang Maass, Prashant Joshi\n\nInstitute for Theoretical Computer Science\n\nTechnische Universitaet Graz\n\nA-8010 Graz, Austria\n\nmaass,joshi@igi.tugraz.at\n\nEduardo D. Sontag\n\nDepartment of Mathematics\n\nRutgers, The State University of New Jersey\n\nPiscataway, NJ 08854-8019, USA\n\nsontag@cs.rutgers.edu\n\nAbstract\n\nThe network topology of neurons in the brain exhibits an abundance of feedback connections, but the computational function of these feedback connections is largely unknown. We present a computational theory that characterizes the gain in computational power achieved through feedback in dynamical systems with fading memory. It implies that many such systems acquire through feedback universal computational capabilities for analog computing with a non-fading memory. In particular, we show that feedback enables such systems to process time-varying input streams in diverse ways according to rules that are implemented through internal states of the dynamical system. In contrast to previous attractor-based computational models for neural networks, these flexible internal states are high-dimensional attractors of the circuit dynamics that still allow the circuit state to absorb new information from online input streams. In this way one arrives at novel models for working memory, integration of evidence, and reward expectation in cortical circuits. 
We show that they are applicable to circuits of conductance-based Hodgkin-Huxley (HH) neurons with high levels of noise that reflect experimental data on in-vivo conditions.\n\n1 Introduction\n\nQuite demanding real-time computations with fading memory1 can be carried out by generic cortical microcircuit models [1]. But many types of computations in the brain, for\n\n1A map (or filter) F from input streams to output streams is defined to have fading memory if its current output at time t depends (up to some precision ε) only on values of the input u during some finite time interval [t − T, t]. In formulas: F has fading memory if for every ε > 0 there exist δ > 0 and T > 0 such that |(Fu)(t) − (Fũ)(t)| < ε for any t ∈ R and any input functions u, ũ with\n\nexample computations that involve memory or persistent internal states, cannot be modeled by such fading memory systems. On the other hand, concrete examples of artificial neural networks [2] and cortical microcircuit models [3] suggest that their computational power can be enlarged through feedback from trained readouts. Furthermore, the brain is known to have an abundance of feedback connections on several levels: within cortical areas, where pyramidal cells typically have, in addition to their long projecting axon, a number of local axon collaterals; between cortical areas; and between cortex and subcortical structures. But the computational role of these feedback connections has remained open. We present here a computational theory which characterizes the gain in computational power that a fading memory system can acquire through feedback from trained readouts, both in the idealized case without noise and in the case with noise. This theory simultaneously characterizes the potential gain in computational power resulting from training a few neurons within a generic recurrent circuit for a specific task. 
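The fading memory property of footnote 1 can be illustrated with a short numerical sketch (our illustration, not part of the original experiments; the exponential low-pass filter and all parameter values are chosen only for demonstration): two input streams that agree on a sufficiently long recent window [t − T, t] produce almost identical filter outputs at time t.

```python
import numpy as np

# Illustration of the fading memory property (footnote 1): an exponential
# low-pass filter (F u)(t) forgets inputs older than a few time constants.
def lowpass(u, dt=0.001, tau=0.03):
    """Euler-integrate y' = (-y + u) / tau; tau = 30 ms for illustration."""
    y = np.zeros_like(u)
    for k in range(1, len(u)):
        y[k] = y[k - 1] + dt * (-y[k - 1] + u[k - 1]) / tau
    return y

rng = np.random.default_rng(0)
n = 2000                       # 2 s of input sampled at 1 ms
u1 = rng.standard_normal(n)
u2 = rng.standard_normal(n)
u2[n // 2:] = u1[n // 2:]      # u1 and u2 agree only during the last second

y1, y2 = lowpass(u1), lowpass(u2)
# Although u1 and u2 differ during the first second, the filter outputs at
# the final time are essentially identical (fading memory with T = 1 s):
print(abs(y1[-1] - y2[-1]))
```

Feedback is what allows a system to escape this limitation, as formalized in the theorems of Section 2.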
Applications of this theory to cortical microcircuit models provide a new way of explaining the possibility of real-time processing of afferent input streams in the light of learning-induced internal circuit states that might represent, for example, working memory or rules for the timing of behavior. Further details on these results can be found in [4].\n\n2 Computational Theory\n\nRecurrent circuits of neurons are, from a mathematical perspective, special cases of dynamical systems. The subsequent mathematical results show that a large variety of dynamical systems, in particular also neural circuits, can overcome in the presence of feedback the computational limitations of a fading memory – without necessarily falling into the chaotic regime. In fact, feedback endows them with universal capabilities for analog computing, in a sense that can be made precise in the following way (see Fig. 1A-C for an illustration):\n\nTheorem 2.1 A large class Sn of systems of differential equations of the form\n\nx′i(t) = fi(x1(t), ..., xn(t)) + gi(x1(t), ..., xn(t)) · v(t),   i = 1, ..., n     (1)\n\nare in the following sense universal for analog computing: each such system can respond to an external input u(t) with the dynamics of any nth order differential equation of the form\n\nz(n)(t) = G(z(t), z′(t), z″(t), ..., z(n−1)(t)) + u(t)     (2)\n\n(for arbitrary smooth functions G : Rn → 
R) if the input term v(t) is replaced by a suitable memoryless feedback function K(x1(t), ..., xn(t), u(t)), and if a suitable memoryless readout function h(x1(t), ..., xn(t)) is applied to its internal state ⟨x1(t), ..., xn(t)⟩. Also the dynamic responses of all systems consisting of several higher order differential equations of the form (2) can be simulated by fixed systems of the form (1) with a corresponding number of feedbacks.\n\nThe class Sn of dynamical systems that become through feedback universal for analog computing subsumes2 systems of the form\n\nx′i(t) = −λi xi(t) + σ( Σj=1,...,n aij · xj(t) ) + bi · v(t),   i = 1, ..., n     (3)\n\nthat are commonly used to model the temporal evolution of firing rates in neural circuits (σ is some standard activation function). \n\n‖u(τ) − ũ(τ)‖ < δ for all τ ∈ [t − T, t]. This is a characteristic property of all filters that can be approximated by an integral over the input stream u, or more generally by Volterra or Wiener series.\n\n2For example if the λi are pairwise different, aij = 0 for all i, j, and all bi are nonzero; fewer restrictions are needed if more than one feedback to the system (3) can be used.\n\nFigure 1: Universal computational capability acquired through feedback according to Theorem 2.1. (A) A fixed circuit C with dynamics (1). (B) An arbitrarily given nth order dynamical system (2) with external input u(t). (C) If the input v(t) to circuit C is replaced by a suitable feedback K(x(t), u(t)), then this fixed circuit C can simulate the dynamic response z(t) of the arbitrarily given system shown in B, for any input stream u(t).\n\n
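As a toy illustration of the effect that feedback has on a system of type (3) (our sketch, not the construction from the proof in [4]): a single leaky unit x′ = −x + v has fading memory for v = 0, but the memoryless feedback v = K(x) = 2 tanh(x) gives it two stable fixed points, i.e. a non-fading binary state.

```python
import numpy as np

# A single unit of type (3): x' = -x + v(t), Euler-integrated.
# Without feedback every state decays (fading memory); with the memoryless
# feedback v = K(x) = 2*tanh(x) the unit becomes bistable (non-fading state).
def run(x0, feedback, dt=0.01, steps=2000):
    x = x0
    for _ in range(steps):
        v = 2.0 * np.tanh(x) if feedback else 0.0
        x += dt * (-x + v)
    return x

x_open = run(0.5, feedback=False)    # decays toward 0: the memory fades
x_up   = run(0.5, feedback=True)     # settles near the positive fixed point
x_down = run(-0.5, feedback=True)    # settles near the negative fixed point
print(x_open, x_up, x_down)
```

The two stable fixed points of x = 2 tanh(x) (at x ≈ ±1.92) play the role of discrete non-fading states; additive noise of small bounded amplitude merely jitters the state within each basin of attraction.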
If the activation function σ is also applied to the term v(t) in (3), the system (3) can still simulate arbitrary differential equations (2) with bounded inputs u(t) and bounded responses z(t), ..., z(n−1)(t).\n\nNote that according to [5] all Turing machines can be simulated by systems of differential equations of the form (2). Hence the systems (1) become through feedback also universal for digital computing. A proof of Theorem 2.1 is given in [4].\n\nIt has been shown that additive noise, even with an arbitrarily small bounded amplitude, reduces the non-fading memory capacity of any recurrent neural network to some finite number of bits [6, 7]. Hence such networks can no longer simulate arbitrary Turing machines. But feedback can still endow noisy fading memory systems with the maximum possible computational power within this a priori limitation. The following result shows that in principle any finite state machine (= deterministic finite automaton), in particular any Turing machine with tapes of some arbitrary but fixed finite length, can be emulated by a fading memory system with feedback, in spite of noise in the system.\n\nTheorem 2.2 Feedback allows linear and nonlinear fading memory systems, even in the presence of additive noise with bounded amplitude, to employ the computational capability and non-fading states of any given finite state machine (in addition to their fading memory) for real-time processing of time-varying inputs.\n\nThe precise formalization and the proof of this result (see [4]) are technically rather involved, and cannot be given in this abstract. A key method of the proof, which makes sure that noise does not get amplified through feedback, is also applied in the subsequent computer simulations of cortical microcircuit models. 
There the readout functions K that provide feedback values K(x(t)) are trained to assume values which cancel the impact of errors or imprecision in the values K(x(s)) of this feedback for immediately preceding time steps s < t.\n\n3 Application to Generic Circuits of Noisy Neurons\n\nWe tested this computational theory on circuits consisting of 600 integrate-and-fire (I&F) neurons and circuits consisting of 600 conductance-based HH neurons, in either case with a rather high level of noise that reflects experimental data on in-vivo conditions [8]. In addition we used models for dynamic synapses whose individual mixture of paired-pulse depression and facilitation is based on experimental data [9, 10]. Sparse connectivity between neurons with a biologically realistic bias towards short connections was generated by a probabilistic rule, and synaptic parameters were randomly chosen, depending on the type of pre- and postsynaptic neurons, in accordance with these empirical data (see [1] or [4] for details). External inputs and feedback from readouts were connected to populations of neurons within the circuit, with randomly varying connection strengths. The current circuit state x(t) was modeled by low-pass filtered spike trains from all neurons in the circuit (with a time constant of 30 ms, modeling time constants of receptors and membranes of potential readout neurons). Readout functions K(x(t)) were modeled by weighted sums w · x(t)\n\nFigure 2: State-dependent real-time processing of 4 independent input streams in a generic cortical microcircuit model. (A) 4 input streams, each consisting of 8 spike trains generated by Poisson processes with randomly varying rates ri(t), i = 1, ..., 4 (rates plotted in (B); all rates are given in Hz). The 4 input streams and the feedback were injected into disjoint but densely interconnected subpopulations of neurons in the circuit. 
(C) Resulting firing activity of 100 out of the 600 I&F neurons in the circuit. Spikes from inhibitory neurons marked in gray. (D) Target activation times of the high-dimensional attractor (gray shading), spike trains of 2 of the 8 I&F neurons that were trained to create the high-dimensional attractor by sending their output spike trains back into the circuit, and average firing rate of all 8 neurons (lower trace). (E and F) Performance of linear readouts that were trained to switch their real-time computation task depending on the current state of the high-dimensional attractor: output 2 · r3(t) instead of r3(t) if the high-dimensional attractor is on (E); output r3(t) + r4(t) instead of |r3(t) − r4(t)| if the high-dimensional attractor is on (F). (G) Performance of a linear readout that was trained to output r3(t) · r4(t), showing that another linear readout from the same circuit can simultaneously carry out nonlinear computations that are invariant to the current state of the high-dimensional attractor.\n\nwhose weights w were trained during 200 s of simulated biological time to minimize the mean squared error with regard to desired target output functions K. After training, these weights w were fixed, and the performance of the otherwise generic circuit was evaluated for new input streams u (with new input rates drawn from the same distribution) that had not been used for training. 
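The readout scheme described above can be sketched in a few lines (a simplified stand-in: the rate-modulated Poisson spikes below replace the actual circuit simulation, and all parameter values are hypothetical; only the 30 ms low-pass filtering of spike trains and the least-squares fit of w follow the text):

```python
import numpy as np

rng = np.random.default_rng(1)
dt, tau = 0.001, 0.030            # 1 ms time step, 30 ms state filter
T, n_neurons = 5000, 200          # 5 s of simulated time

# Stand-in for circuit activity: Poisson spikes whose rates follow a slowly
# varying signal r(t) with random per-neuron gains (illustrative only).
r = np.convolve(rng.standard_normal(T), np.ones(200) / 200, mode="same")
r /= r.std()
gains = rng.uniform(0.5, 1.5, n_neurons)
rates = np.clip(30.0 + 15.0 * np.outer(r, gains), 0.0, None)     # in Hz
spikes = rng.random((T, n_neurons)) < rates * dt

# Circuit state x(t): low-pass filtered spike trains (time constant 30 ms).
x = np.zeros((T, n_neurons))
for k in range(1, T):
    x[k] = x[k - 1] * (1 - dt / tau) + spikes[k]

# Linear readout w · x(t), fitted by regularized least squares to target r(t).
X = np.hstack([x, np.ones((T, 1))])                              # bias column
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(n_neurons + 1), X.T @ r)
corr = np.corrcoef(X @ w, r)[0, 1]
print(corr)
```

In the actual experiments the state x(t) comes from the recurrent circuit itself, the fit minimizes the mean squared error over 200 s of simulated time, and performance is reported on fresh test inputs; here the correlation is only measured in-sample.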
It was sufficient to use just linear functions K that transformed the current circuit state x(t) into a feedback K(x(t)), confirming the predictions of [1] and [2] that the recurrent circuit automatically assumes the role of a kernel (in the sense of machine learning) that creates nonlinear combinations of recent inputs.\n\nWe found that computer simulations of such generic cortical microcircuit models confirm the theoretical prediction that feedback from suitably trained readouts enables complex state-dependent real-time processing of a fairly large number of diverse input spike trains within a single circuit (all results shown are for test inputs that had not been used for training). Readout neurons could be trained to turn a high-dimensional attractor on or off in response to particular signals in 2 of the 4 independent input streams (Fig. 2D). The target value for K(x(t)) during training was the currently desired activity-state of the high-dimensional attractor, where x(t) resulted from already giving tentative spike trains that matched this target value as feedback into the circuit. These neurons were trained to represent in their firing activity at any time the information in which of input streams 1 or 2 a burst had most recently occurred. If it occurred most recently in stream 1, they were trained to fire at 40 Hz, and not to fire otherwise. Thus these neurons were required to represent the non-fading state of a very simple finite state machine, demonstrating in a simple example the validity of Theorem 2.2.\n\nThe weights w of these readout neurons were determined by a sign-constrained linear regression, so that weights from excitatory (inhibitory) presynaptic neurons were automatically positive (negative). 
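Such a sign-constrained regression can be sketched, for instance, as projected gradient descent on the squared error (our choice of solver; the paper does not specify one): after each gradient step, weights from excitatory columns are clipped to be ≥ 0 and weights from inhibitory columns to be ≤ 0.

```python
import numpy as np

def sign_constrained_lstsq(A, b, signs, iters=2000):
    """Minimize ||A w - b||^2 subject to sign(w_i) matching signs[i]."""
    lr = 1.0 / np.linalg.norm(A, 2) ** 2        # safe step from spectral norm
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        w -= lr * (A.T @ (A @ w - b))           # gradient step
        # project onto the constraint set: excitatory >= 0, inhibitory <= 0
        w = np.where(signs > 0, np.maximum(w, 0.0), np.minimum(w, 0.0))
    return w

rng = np.random.default_rng(2)
A = rng.standard_normal((300, 20))              # 300 state samples, 20 neurons
signs = np.where(np.arange(20) < 16, 1, -1)     # e.g. 80% excitatory columns
w_true = signs * rng.uniform(0.1, 1.0, 20)      # a target respecting the signs
b = A @ w_true + 0.01 * rng.standard_normal(300)

w = sign_constrained_lstsq(A, b, signs)
print(np.max(np.abs(w - w_true)))
```

Since the constraint set is convex, the projection after each step preserves convergence of gradient descent, and the recovered weights respect the excitatory/inhibitory signs exactly.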
Since these readout neurons had the same properties as neurons within the circuit, this computer simulation also provided a first indication of the gain in real-time processing capability that can be achieved by suitable training of a few spiking neurons within an otherwise randomly connected recurrent circuit. Fig. 2 shows that other readouts from the same circuit (that do not provide feedback) can be trained to amplify their response to one of the input streams (Fig. 2E), or even switch their computational function (Fig. 2F) if the high-dimensional attractor is in the on-state, thereby providing a model for the way in which internal circuit states can change the “program” for the circuit's online processing.\n\nContinuous high-dimensional attractors that hold a time-varying analog value (instead of a discrete state) through globally distributed activity within the circuit can be created in the same way through feedback. In fact, several such high-dimensional attractors can co-exist within the same circuit, see Fig. 3B,C,D. This gives rise to a model (Fig. 3) that could explain how timing of behavior and reward expectation are learnt and controlled by neural microcircuits on a behaviorally relevant large time scale. In addition, Fig. 4 shows that a continuous high-dimensional attractor that is created through feedback provides a new model for a neural integrator, and that the current value of this neural integrator can be combined within the same circuit and in real-time with variables extracted from time-varying analog input streams.\n\nThis learning-induced generation of high-dimensional attractors through feedback provides a new model for the emergence of persistent firing in cortical circuits that does not rely on especially constructed circuits, neurons, or synapses, and which is consistent with high noise (see Fig. 
4G for the quite realistic trial-to-trial variability in this circuit of HH neurons with background noise according to [8]). This learning-based model is also consistent with the surprising plasticity that has recently been observed even in quite specialized neural integrators [11]. Its robustness can be traced back to the fact that readouts can be trained to correct errors in their previous feedback. Furthermore such error correction is not restricted to linear computational operations, since the inherent kernel property of generic recurrent circuits allows even linear readouts to carry out nonlinear computations on firing rates (Fig. 2G). Whereas previous models for discrete or continuous attractors in recurrent neural circuits required that the whole dynamics of such a circuit was entrained by the attractor, our new model predicts that persistent firing states can co-exist with other high-dimensional attractors and with responses to time-varying afferent inputs within the same circuit. Note that such attractors can equivalently be generated by training (instead of readouts) a few neurons within an otherwise generic cortical microcircuit model.\n\nFigure 3: Representation of time for behaviorally relevant time spans in a generic cortical microcircuit model. (A) Afferent circuit input, consisting of a cue in one channel (gray) and random spikes (freshly drawn for each trial) in the other channels. (B) Response of 100 neurons from the same circuit as in Fig. 2, which here has two co-existing high-dimensional attractors. The autonomously generated periodic bursts with a frequency of about 8 Hz are not related to the task, and readouts were trained to become invariant to them. (C and D) Feedback from two linear readouts that were simultaneously trained to create and control two high-dimensional attractors. 
One of them was trained to decay in 400 ms (C), and the other in 600 ms (D) (scale in nA is the average current injected by feedback into a randomly chosen subset of neurons in the circuit). (E) Response of the same neurons as in (B), for the same circuit input, but with feedback from a different linear readout that was trained to create a high-dimensional attractor that increases its activity and reaches a plateau 600 ms after the occurrence of the cue in the input stream. (F) Feedback from the linear readout that creates this continuous high-dimensional attractor.\n\n4 Discussion\n\nWe have demonstrated that persistent memory and online switching of real-time processing can be implemented in generic cortical microcircuit models by training a few neurons\n\nFigure 4: A model for analog real-time computation on external and internal variables in a generic cortical microcircuit (consisting of 600 conductance-based HH neurons). (A and B) Two input streams as in Fig. 2; their firing rates r1(t), r2(t) are shown in (B). (C) Resulting firing activity of 100 neurons in the circuit. (D) Performance of a neural integrator, generated by feedback from a linear readout that was trained to output at any time t an approximation CA(t) of the integral ∫0t (r1(s) − r2(s)) ds over the difference of both input rates. Feedback values were injected as input currents into a randomly chosen subset of neurons in the circuit. Scale in nA shows average strength of feedback currents (also in panel H). (E) Performance of a linear readout that was trained to output 0 as long as CA(t) stayed below 1.35 nA, and then to output r2(t) until the value of CA(t) dropped below 0.45 nA (i.e., in this test run during the shaded time periods). (F) Performance of a linear readout trained to output r1(t) − CA(t), i.e. a combination of external and internal variables, at any time t (both r1 and CA normalized into the range [0, 1]). 
(G) Response of a randomly chosen neuron in the circuit for 10 repetitions of the same experiment (with input spike trains generated by Poisson processes with the same time-course of firing rates), showing biologically realistic trial-to-trial variability. (H) Activity traces of a continuous attractor as in (D), but in 8 different trials for 8 different fixed values of r1 and r2 (shown on the right). The resulting traces are very similar to the temporal evolution of firing rates of neurons in area LIP that integrate sensory evidence (see Fig. 5A in [12]).\n\n(within or outside of the circuit) through very simple learning processes (linear regression, or alternatively – with some loss in performance – perceptron learning). The resulting high-dimensional attractors can be made noise-robust through training, thereby overcoming the inherent brittleness of constructed attractors. The high dimensionality of these attractors, which is caused by the small number of synaptic weights that are fixed for their creation, allows the circuit state to move in or out of other attractors, and to absorb new information from online inputs, while staying within such a high-dimensional attractor. The resulting virtually unlimited computational capability of fading memory circuits with feedback can be explained on the basis of the theoretical results that were presented in Section 2.\n\nAcknowledgments\n\nHelpful comments from Wulfram Gerstner, Stefan Haeusler, Herbert Jaeger, Konrad Koerding, Henry Markram, Gordon Pipa, Misha Tsodyks, and Tony Zador are gratefully acknowledged. Written under partial support by the Austrian Science Fund FWF, project # S9102-N04, project # IST2002-506778 (PASCAL) and project # FP6-015879 (FACETS) of the European Union.\n\nReferences\n\n[1] W. Maass, T. Natschläger, and H. Markram. Real-time computing without stable states: A new framework for neural computation based on perturbations. 
Neural Computation, 14(11):2531–2560, 2002.\n\n[2] H. Jäger and H. Haas. Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science, 304:78–80, 2004.\n\n[3] P. Joshi and W. Maass. Movement generation with circuits of spiking neurons. Neural Computation, 17(8):1715–1738, 2005.\n\n[4] W. Maass, P. Joshi, and E. D. Sontag. Computational aspects of feedback in neural circuits. Submitted for publication, 2005. Available online as #168 from http://www.igi.tugraz.at/maass/.\n\n[5] M. S. Branicky. Universal computation and other capabilities of hybrid and continuous dynamical systems. Theoretical Computer Science, 138:67–100, 1995.\n\n[6] M. Casey. The dynamics of discrete-time computation with application to recurrent neural networks and finite state machine extraction. Neural Computation, 8:1135–1178, 1996.\n\n[7] W. Maass and P. Orponen. On the effect of analog noise in discrete-time analog computations. Neural Computation, 10:1071–1095, 1998.\n\n[8] A. Destexhe, M. Rudolph, and D. Pare. The high-conductance state of neocortical neurons in vivo. Nature Reviews Neuroscience, 4(9):739–751, 2003.\n\n[9] H. Markram, Y. Wang, and M. Tsodyks. Differential signaling via the same axon of neocortical pyramidal neurons. PNAS, 95:5323–5328, 1998.\n\n[10] A. Gupta, Y. Wang, and H. Markram. Organizing principles for a diversity of GABAergic interneurons and synapses in the neocortex. Science, 287:273–278, 2000.\n\n[11] G. Major, R. Baker, E. Aksay, B. Mensh, H. S. Seung, and D. W. Tank. Plasticity and tuning by visual feedback of the stability of a neural integrator. Proc. Natl. Acad. Sci., 101(20):7739–7744, 2004.\n\n[12] M. E. Mazurek, J. D. Roitman, J. Ditterich, and M. N. Shadlen. A role for neural integrators in perceptual decision making. 
Cerebral Cortex, 13(11):1257–1269, 2003.", "award": [], "sourceid": 2864, "authors": [{"given_name": "Wolfgang", "family_name": "Maass", "institution": null}, {"given_name": "Prashant", "family_name": "Joshi", "institution": null}, {"given_name": "Eduardo", "family_name": "Sontag", "institution": null}]}