{"title": "Learning optimal spike-based representations", "book": "Advances in Neural Information Processing Systems", "page_first": 2285, "page_last": 2293, "abstract": "How do neural networks learn to represent information? Here, we address this question by assuming that neural networks seek to generate an optimal population representation for a fixed linear decoder. We define a loss function for the quality of the population read-out and derive the dynamical equations for both neurons and synapses from the requirement to minimize this loss. The dynamical equations yield a network of integrate-and-fire neurons undergoing Hebbian plasticity. We show that, through learning, initially regular and highly correlated spike trains evolve towards Poisson-distributed and independent spike trains with much lower firing rates. The learning rule drives the network into an asynchronous, balanced regime where all inputs to the network are represented optimally for the given decoder. We show that the network dynamics and synaptic plasticity jointly balance the excitation and inhibition received by each unit as tightly as possible and, in doing so, minimize the prediction error between the inputs and the decoded outputs. In turn, spikes are only signalled whenever this prediction error exceeds a certain value, thereby implementing a predictive coding scheme. Our work suggests that several of the features reported in cortical networks, such as the high trial-to-trial variability, the balance between excitation and inhibition, and spike-timing dependent plasticity, are simply signatures of an efficient, spike-based code.", "full_text": "Learning optimal spike-based representations\n\nRalph Bourdoukan\u2217\nGroup for Neural Theory\n\u00c9cole Normale Sup\u00e9rieure\n\nParis, France\n\nDavid G.T. Barrett\u2217\nGroup for Neural Theory\n\u00c9cole Normale Sup\u00e9rieure\n\nParis, France\n\nralph.bourdoukan@ens.fr\n\ndavid.barrett@ens.fr\n\nChristian K. 
Machens\n\nChampalimaud Neuroscience Programme\nChampalimaud Centre for the Unknown\n\nchristian.machens@neuro.fchampalimaud.org\n\nLisbon, Portugal\n\nSophie Den\u00e8ve\n\nGroup for Neural Theory\n\u00c9cole Normale Sup\u00e9rieure\n\nParis, France\n\nsophie.deneve@ens.fr\n\nAbstract\n\nHow can neural networks learn to represent information optimally? We answer\nthis question by deriving spiking dynamics and learning dynamics directly from\na measure of network performance. We find that a network of integrate-and-fire\nneurons undergoing Hebbian plasticity can learn an optimal spike-based representation\nfor a linear decoder. The learning rule acts to minimise the membrane\npotential magnitude, which can be interpreted as a representation error after learning.\nIn this way, learning reduces the representation error and drives the network\ninto a robust, balanced regime. The network becomes balanced because small representation\nerrors correspond to small membrane potentials, which in turn result\nfrom a balance of excitation and inhibition. The representation is robust because\nneurons become self-correcting, only spiking if the representation error exceeds a\nthreshold. Altogether, these results suggest that several observed features of cortical\ndynamics, such as excitatory-inhibitory balance, integrate-and-fire dynamics\nand Hebbian plasticity, are signatures of a robust, optimal spike-based code.\n\nA central question in neuroscience is to understand how populations of neurons represent information\nand how they learn to do so. Usually, learning and information representation are treated as two\ndifferent functions. From the outset, this separation seems like a good idea, as it reduces the problem\ninto two smaller, more manageable chunks. 
Our approach, however, is to study these together.\nThis allows us to treat learning and information representation as two sides of a single mechanism,\noperating at two different timescales.\nExperimental work has given us several clues about the regime in which real networks operate in\nthe brain. Some of the most prominent observations are: (a) high trial-to-trial variability (a neuron\nresponds differently to repeated, identical inputs) [1, 2]; (b) asynchronous firing at the network\nlevel (spike trains of different neurons are at most very weakly correlated) [3, 4, 5]; (c) tight balance\nof excitation and inhibition (every excitatory input is met by an inhibitory input of equal or greater\nsize) [6, 7, 8]; and (d) spike-timing-dependent plasticity (STDP) (the strength of a synapse changes as\na function of presynaptic and postsynaptic spike times) [9].\nPreviously, it has been shown that observations (a)\u2013(c) can be understood as signatures of an optimal,\nspike-based code [10, 11]. The essential idea is to derive spiking dynamics from the assumption that\nneurons only fire if their spike improves information representation. Information in a network may\noriginate from several possible sources: external sensory input, external neural network input, or\nalternatively, it may originate within the network itself as a memory, or as a computation. Whatever\nthe source, this initial assumption leads directly to the conclusion that a network of integrate-and-fire\nneurons can optimally represent a signal while exhibiting properties (a)\u2013(c).\n\n\u2217Authors contributed equally\n\nA major problem with this framework is that network connectivity must be completely specified a\npriori, and requires the tuning of N\u00b2 parameters, where N is the number of neurons in the network.\nAlthough this is feasible mathematically, it is unclear how a real network could tune itself into this\noptimal regime. 
In this work, we solve this problem using a simple synaptic learning rule. The key\ninsight is that the plasticity rule can be derived from the same basic principle as the spiking rule in\nthe earlier work\u2014namely, that any change should improve information representation.\nSurprisingly, this can be achieved with a local, Hebbian learning rule, where synaptic plasticity\nis proportional to the product of presynaptic \ufb01ring rates with post-synaptic membrane potentials.\nSpiking and synaptic plasticity then work hand in hand towards the same goal: the spiking of a\nneuron decreases the representation error on a fast time scale, thereby giving rise to the actual\npopulation representation; synaptic plasticity decreases the representation error on a slower time\nscale, thereby improving or maintaining the population representation. For a large set of initial\nconnectivities and spiking dynamics, neural networks are driven into a balanced regime, where\nexcitation and inhibition cancel each other and where spike trains are asynchronous and irregular.\nFurthermore, the learning rule that we derive reproduces the main features of STDP (property (d)\nabove). In this way, a network can learn to represent information optimally, with synaptic, neural\nand network dynamics consistent with those observed experimentally.\n\n1 Derivation of the learning rule for a single neuron\n\nWe begin by deriving a learning rule for a single neuron with an autapse (a self-connection) (Fig.\n1A). Our approach is to derive synaptic dynamics for the autapse and spiking dynamics for the\nneuron such that the neuron learns to optimally represent a time-varying input signal. We will derive\na learning rule for networks of neurons later, after we have developed the fundamental concepts for\nthe single neuron case.\nOur \ufb01rst step is to derive optimal spiking dynamics for the neuron, so that we have a target for our\nlearning rule. We do this by making two simple assumptions [11]. 
First, we assume that the neuron\ncan provide an estimate or read-out \u02c6x(t) of a time-dependent signal x(t) by filtering its spike train\no(t) as follows:\n\n\u02d9\u02c6x(t) = \u2212\u02c6x(t) + \u0393o(t),    (1)\n\nwhere \u0393 is a fixed read-out weight, which we will refer to as the neuron\u2019s \u201coutput kernel\u201d, and the\nspike train can be written as o(t) = \u2211i \u03b4(t \u2212 ti), where {ti} are the spike times. Next, we assume\nthat the neuron only produces a spike if that spike improves the read-out, where we measure the\nread-out performance through a simple squared-error loss function:\n\nL(t) = (x(t) \u2212 \u02c6x(t))\u00b2.    (2)\n\nWith these two assumptions, we can now derive optimal spiking dynamics. First, we observe that if\nthe neuron produces an additional spike at time t, the read-out increases by \u0393, and the loss function\nbecomes L(t|spike) = (x(t) \u2212 (\u02c6x(t) + \u0393))\u00b2. This allows us to restate our spiking rule as follows:\nthe neuron should only produce a spike if L(t|no spike) > L(t|spike), or (x(t) \u2212 \u02c6x(t))\u00b2 > (x(t) \u2212\n(\u02c6x(t) + \u0393))\u00b2. Now, expanding both sides of this inequality, defining V(t) \u2261 \u0393(x(t) \u2212 \u02c6x(t)) and\ndefining T \u2261 \u0393\u00b2/2, we find that the neuron should only spike if:\n\nV(t) > T.    (3)\n\nWe interpret V(t) to be the membrane potential of the neuron, and we interpret T as the spike\nthreshold. This interpretation allows us to understand the membrane potential functionally: the\nvoltage is proportional to a prediction error, that is, the difference between the read-out \u02c6x(t) and the actual\nsignal x(t). A spike is an error reduction mechanism: the neuron only spikes if the error exceeds\nthe spike threshold. This is a greedy minimisation, in that the neuron fires a spike whenever that\naction decreases L(t) without considering the future impact of that spike. 
Importantly, the neuron\ndoes not require direct access to the loss function L(t).\nTo determine the membrane potential dynamics, we take the derivative of the voltage, which gives\nus \u02d9V = \u0393(\u02d9x \u2212 \u02d9\u02c6x). (Here, and in the following, we will drop the time index for notational brevity.)\nNow, using Eqn. (1) we obtain \u02d9V = \u0393 \u02d9x \u2212 \u0393(\u2212\u02c6x + \u0393o) = \u2212\u0393(x \u2212 \u02c6x) + \u0393(\u02d9x + x) \u2212 \u0393\u00b2o, so that:\n\n\u02d9V = \u2212V + \u0393c \u2212 \u0393\u00b2o,    (4)\n\nwhere c = \u02d9x + x is the neural input. This corresponds exactly to the dynamics of a leaky integrate-and-fire\nneuron with an inhibitory autapse\u00b9 of strength \u0393\u00b2, and a feedforward connection strength \u0393.\nThe dynamics and connectivity guarantee that a neuron spikes at just the right times to optimise the\nloss function (Fig. 1B). In addition, it is especially robust to noise of different forms, because of\nits error-correcting nature. If x is constant in time, the voltage will rise up to the threshold T at\nwhich point a spike is fired, adding a delta function to the spike train o at time t, thereby producing\na read-out \u02c6x that is closer to x and causing an instantaneous drop in the voltage through the autapse,\nby an amount \u0393\u00b2 = 2T, effectively resetting the voltage to V = \u2212T.\nWe now have a target for learning: we know the connection strength that a neuron must have at the\nend of learning if it is to represent information optimally, for a linear read-out. We can use this target\nto derive synaptic dynamics that can learn an optimal representation from experience. Specifically,\nwe consider an integrate-and-fire neuron with some arbitrary autapse strength \u03c9. The dynamics of\nthis neuron are given by\n\n\u02d9V = \u2212V + \u0393c \u2212 \u03c9o.    (5)\n\nThis neuron will not produce the correct spike train for representing x through a linear read-out\n(Eqn. 
(1)) unless \u03c9 = \u0393\u00b2.\nOur goal is to derive a dynamical equation for the synapse \u03c9 so that the spike train becomes optimal.\nWe do this by quantifying the loss that we are incurring by using the suboptimal strength, and then\nderiving a learning rule that minimises this loss with respect to \u03c9. The loss function underlying\nthe spiking dynamics determined by Eqn. (5) can be found by reversing the previous membrane\npotential analysis. First, we integrate the differential equation for V, assuming that \u03c9 changes on\ntime scales much slower than the membrane potential. We obtain the following (formal) solution:\n\nV = \u0393x \u2212 \u03c9\u00afo,    (6)\n\nwhere \u00afo is determined by \u02d9\u00afo = \u2212\u00afo + o. The solution to this latter equation is \u00afo = h \u2217 o, a convolution\nof the spike train with the exponential kernel h(\u03c4) = \u03b8(\u03c4) exp(\u2212\u03c4). As such, it is analogous to the\ninstantaneous firing rate of the neuron.\nNow, using Eqn. (6), and rewriting the read-out as \u02c6x = \u0393\u00afo, we obtain the loss incurred by the\nsub-optimal neuron,\n\nL = (x \u2212 \u02c6x)\u00b2 = (1/\u0393\u00b2)(V\u00b2 + 2(\u03c9 \u2212 \u0393\u00b2)\u00afoV + (\u03c9 \u2212 \u0393\u00b2)\u00b2\u00afo\u00b2).    (7)\n\nWe observe that the last two terms of Eqn. (7) will vanish whenever \u03c9 = \u0393\u00b2, i.e., when the optimal\nreset has been found. We can therefore simplify the problem by defining an alternative loss function,\n\nLV = (1/2)V\u00b2,    (8)\n\nwhich has the same minimum as the original loss (V = 0 or x = \u02c6x, compare Eqn. (2)), but yields a\nsimpler learning algorithm. We can now calculate how changes to \u03c9 affect LV:\n\n\u2202LV/\u2202\u03c9 = V \u2202V/\u2202\u03c9 = \u2212V \u00afo \u2212 V \u03c9 \u2202\u00afo/\u2202\u03c9.    (9)\n\nWe can ignore the last term in this equation (as we will show below). 
Finally, using simple gradient\ndescent, we obtain a simple Hebbian-like synaptic plasticity rule:\n\n\u03c4 \u02d9\u03c9 = \u2212\u2202LV/\u2202\u03c9 = V \u00afo,    (10)\n\nwhere \u03c4 is the learning time constant.\n\n\u00b9 This contribution of the autapse can also be interpreted as the reset of an integrate-and-fire neuron. Later,\nwhen we generalise to networks of neurons, we shall employ this interpretation.\n\nThis synaptic learning rule is capable of learning the synaptic weight \u03c9 that minimises the difference\nbetween x and \u02c6x (Fig. 1B). During learning, the synaptic weight changes in proportion to the post-synaptic\nvoltage V and the pre-synaptic firing rate \u00afo (Fig. 1C). As such, this is a Hebbian learning\nrule. Of course, in this single neuron case, the pre-synaptic neuron and post-synaptic neuron are the\nsame neuron. The synaptic weight gradually approaches its optimal value \u0393\u00b2. However, it never\ncompletely stabilises, because learning never stops as long as neurons are spiking. Instead, the\nsynapse oscillates closely about the optimal value (Fig. 1D).\nThis is also a \u201cgreedy\u201d learning rule, similar to the spiking rule, in that it seeks to minimise the error\nat each instant in time, without regard for the future impact of those changes. To demonstrate that the\nsecond term in Eqn. (9) can be neglected, we note that the equations for V, \u00afo, and \u03c9 define a system\nof coupled differential equations that can be solved analytically by integrating between spikes. 
This\nresults in a simple recurrence relation for changes in \u03c9 from the ith to the (i + 1)th spike,\n\n\u03c9_(i+1) = \u03c9_i + \u03c9_i(\u03c9_i \u2212 2T) / (\u03c4(T \u2212 \u0393c \u2212 \u03c9_i)).    (11)\n\nThis iterative equation has a single stable fixed point at \u03c9 = 2T = \u0393\u00b2, proving that the neuron\u2019s\nautaptic weight or reset will approach the optimal solution.\n\n2 Learning in a homogeneous network\n\nWe now generalise our learning rule derivation to a network of N identical, homogeneously connected\nneurons. This generalisation is reasonably straightforward because many characteristics of\nthe single neuron case are shared by a network of identical neurons. We will return to the more\ngeneral case of heterogeneously connected neurons in the next section.\nWe begin by deriving optimal spiking dynamics, as in the single neuron case. This provides a target\nfor learning, which we can then use to derive synaptic dynamics. As before, we want our network\nto produce spikes that optimally represent a variable x for a linear read-out. We assume that the\nread-out \u02c6x is provided by summing and filtering the spike trains of all the neurons in the network:\n\n\u02d9\u02c6x = \u2212\u02c6x + \u0393o,    (12)\n\nwhere the row vector \u0393 = (\u0393, . . . , \u0393) contains the read-out weights\u00b2 of the neurons and the column\nvector o = (o1, . . . , oN) their spike trains. Here, we have used identical read-out weights for each\nneuron, because this indirectly leads to homogeneous connectivity, as we will demonstrate.\nNext, we assume that a neuron only spikes if that spike reduces a loss-function. This spiking rule is\nsimilar to the single neuron spiking rule except that this time there is some ambiguity about which\nneuron should spike to represent a signal. Indeed, there are many different spike patterns that provide\nexactly the same estimate \u02c6x. 
For example, one neuron could fire regularly at a high rate (exactly like\nour previous single neuron example) while all others are silent. To avoid this firing rate ambiguity,\nwe use a modified loss function that selects, amongst all equivalent solutions, those with the smallest\nneural firing rates. We do this by adding a \u2018metabolic cost\u2019 term to our loss function, so that high\nfiring rates are penalised:\n\nL = (x \u2212 \u02c6x)\u00b2 + \u00b5\u2016\u00afo\u2016\u00b2,    (13)\n\nwhere \u00b5 is a small positive constant that controls the cost-accuracy trade-off, akin to a regularisation\nparameter.\nEach neuron in the optimal network will seek to reduce this loss function by firing a spike. Specifically,\nthe ith neuron will spike whenever L(no spike in i) > L(spike in i). This leads to the following\nspiking rule for the ith neuron:\n\nVi > Ti,    (14)\n\nwhere Vi \u2261 \u0393(x \u2212 \u02c6x) \u2212 \u00b5\u00afoi and Ti \u2261 \u0393\u00b2/2 + \u00b5/2. We can naturally interpret Vi as the membrane\npotential of the ith neuron and Ti as the spiking threshold of that neuron. As before, we can now\nderive membrane potential dynamics:\n\n\u02d9V = \u2212V + \u0393\u1d40c \u2212 (\u0393\u1d40\u0393 + \u00b5I)o,    (15)\n\nwhere I is the identity matrix and \u0393\u1d40\u0393 + \u00b5I is the network connectivity. We can interpret the self-connection\nterms {\u0393\u00b2 + \u00b5} as voltage resets that decrease the voltage of any neuron that spikes. This\noptimal network is equivalent to a network of identical integrate-and-fire neurons with homogeneous\ninhibitory connectivity.\n\n\u00b2 The read-out weights must scale as \u0393 \u223c 1/N so that firing rates are not unrealistically small in large\nnetworks. We can see this by calculating the average firing rate \u2211i \u00afoi/N \u2248 x/(\u0393N) \u223c O(N/N) \u223c O(1), where the sum runs over i = 1, . . . , N.\n\nThe network has some interesting dynamical properties. 
The voltages of all the neurons are largely\nsynchronous, all increasing to the spiking threshold at about the same time\u00b3 (Fig. 1F). Nonetheless,\nneural spiking is asynchronous. The first neuron to spike will reset itself by \u0393\u00b2 + \u00b5, and it will\ninhibit all the other neurons in the network by \u0393\u00b2. This mechanism prevents neurons from spiking\nsynchronously. The population as a whole acts similarly to the single neuron in our previous\nexample. Each neuron fires regularly, even if a different neuron fires in every integration cycle.\n\n\u00b3 The first neuron to spike will be random if there is some membrane potential noise.\n\nFigure 1: Learning in a single neuron and a homogeneous network. (A) A single neuron represents\nan input signal x by producing an output \u02c6x. (B) During learning, the single neuron output \u02c6x (solid red\nline, top panel) converges towards the input x (blue). Similarly, for a homogeneous network the output\n\u02c6x (dashed red line, top panel) converges towards x. Connectivity also converges towards optimal\nconnectivity in both the single neuron case (solid black line, middle panel) and the homogeneous network\ncase (dashed black line, middle panel), as quantified by D = maxi,j(|\u2126ij \u2212 \u2126opt_ij|\u00b2 / |\u2126opt_ij|\u00b2)\nat each point in time. Consequently, the membrane potential reset (bottom panel) converges towards\nthe optimal reset (green line, bottom panel). Spikes are indicated by blue vertical marks, and are\nproduced when the membrane potential reaches threshold (bottom panel). Here, we have rescaled\ntime, as indicated, for clarity. (C) Our learning rule dictates that the autapse \u03c9 in our single neuron\n(bottom panel) changes in proportion to the membrane potential (top panel) and the firing rate (middle\npanel). (D) At the end of learning, the reset \u03c9 fluctuates weakly about the optimal value. (E) For\na homogeneous network, neurons spike regularly at the start of learning, as shown in this raster plot.\nMembrane potentials of different neurons are weakly correlated. (F) At the end of learning, spiking\nis very irregular and membrane potentials become more synchronous.\n\nThe design of this optimal network requires the tuning of N(N \u2212 1) synaptic parameters. How can\nan arbitrary network of integrate-and-fire neurons learn this optimum? As before, we address this\nquestion by using the optimal network as a target for learning. We start with an arbitrarily connected\nnetwork of integrate-and-fire neurons:\n\n\u02d9V = \u2212V + \u0393\u1d40c \u2212 \u2126o,    (16)\n\nwhere \u2126 is a matrix of connectivity weights, which includes the resets of the individual neurons.\nAssuming that learning occurs on a slow time scale, we can rewrite this equation as\n\nV = \u0393\u1d40x \u2212 \u2126\u00afo.    (17)\n\nNow, repeating the arguments from the single neuron derivation, we modify the loss function to\nobtain an online learning rule. Specifically, we set LV = \u2016V\u2016\u00b2/2, and calculate the gradient:\n\n\u2202LV/\u2202\u2126ij = \u2211k Vk \u2202Vk/\u2202\u2126ij = \u2212\u2211k Vk\u03b4ki\u00afoj \u2212 \u2211kl Vk\u2126kl \u2202\u00afol/\u2202\u2126ij.    (18)\n\nWe can simplify this equation considerably by observing that the contribution of the second summation\nis largely averaged out under a wide variety of realistic conditions\u2074. Therefore, it can be\nneglected, and we obtain the following local learning rule:\n\n\u03c4 \u02d9\u2126ij = \u2212\u2202LV/\u2202\u2126ij = Vi\u00afoj.    (19)\n\nThis is a Hebbian plasticity rule, whereby connectivity changes in proportion to the presynaptic\nfiring rate \u00afoj and post-synaptic membrane potential Vi. We assume that the neural thresholds are set\nto a constant T and that the neural resets are set to their optimal values \u2212T. In the previous section\nwe demonstrated that these resets can be obtained by a Hebbian plasticity rule (Eqn. (10)).\nThis learning rule minimises the difference between the read-out and the signal, by approaching\nthe optimal recurrent connection strengths for the network (Fig. 1B). As in the single neuron case,\nlearning does not stop, so the connection strengths fluctuate close to their optimal value. During\nlearning, network activity becomes progressively more asynchronous as it progresses towards optimal\nconnectivity (Fig. 1E, F).\n\n3 Learning in the general case\n\nNow that we have developed the fundamental concepts underlying our learning rule, we can derive\na learning rule for the more general case of a network of N arbitrarily connected leaky integrate-and-fire\nneurons. Our goal is to understand how such networks can learn to optimally represent a\nJ-dimensional signal x = (x1, . . . 
, xJ), using the read-out equation \u02d9\u02c6x = \u2212\u02c6x + \u0393o.\nWe consider a network with the following membrane potential dynamics:\n\n\u02d9V = \u2212V + \u0393\u1d40c \u2212 \u2126o,    (20)\n\nwhere c is a J-dimensional input. We assume that this input is related to the signal according to\nc = \u02d9x + x. This assumption can be relaxed by treating the input as the control for an arbitrary\nlinear dynamical system, in which case the signal represented by the network is the output of such a\ncomputation [11]. However, this further generalisation is beyond the scope of this work.\n\n\u2074 From the definition of the membrane potential we can see that Vk \u223c O(1/N) because \u0393 \u223c 1/N. Therefore,\nthe size of the first term in Eqn. (18) is \u2211k Vk\u03b4ki\u00afoj = Vi\u00afoj \u223c O(1/N). Therefore, the second term can\nbe ignored if \u2211kl Vk\u2126kl \u2202\u00afol/\u2202\u2126ij \u226a O(1/N). This happens if \u2126kl \u226a O(1/N\u00b2) as at the start of learning.\nIt also happens towards the end of learning if the terms {\u2126kl \u2202\u00afol/\u2202\u2126ij} are weakly correlated with zero mean,\nor if the membrane potentials {Vi} are weakly correlated with zero mean.\n\nAs before, we need to identify the optimal recurrent connectivity so that we have a target for learning.\nMost generally, the optimal recurrent connectivity is \u2126opt \u2261 \u0393\u1d40\u0393 + \u00b5I. The output kernels of the\nindividual neurons, \u0393i, are given by the rows of \u0393, and their spiking thresholds by Ti \u2261 \u2016\u0393i\u2016\u00b2/2 +\n\u00b5/2. With these connections and thresholds, we find that a network of integrate-and-fire neurons\nwill produce spike trains in such a way that the loss function L = \u2016x \u2212 \u02c6x\u2016\u00b2 + \u00b5\u2016\u00afo\u2016\u00b2 is minimised,\nwhere the read-out is given by \u02c6x = \u0393\u00afo. 
We can show this by prescribing a greedy\u2075 spike rule:\na spike is fired by neuron i whenever L(no spike in i) > L(spike in i) [11]. The resulting spike\ngeneration rule is\n\nVi > Ti,    (21)\n\nwhere Vi \u2261 \u0393i\u1d40(x \u2212 \u02c6x) \u2212 \u00b5\u00afoi is interpreted as the membrane potential.\n\n\u2075 Despite being greedy, this spiking rule can generate firing rates that are practically identical to the optimal\nsolutions: we checked this numerically in a large ensemble of networks with randomly chosen kernels.\n\nFigure 2: Learning in a heterogeneous network. (A) A network of neurons represents an input\nsignal x by producing an output \u02c6x. (B) During learning, the loss L decreases (top panel). The difference\nbetween the connection strengths and the optimal strengths also decreases (middle panel), as\nquantified by the mean difference (solid line), given by D = \u2016\u2126 \u2212 \u2126opt\u2016\u00b2 / \u2016\u2126opt\u2016\u00b2, and the maximum\ndifference (dashed line), given by maxi,j(|\u2126ij \u2212 \u2126opt_ij|\u00b2 / |\u2126opt_ij|\u00b2). The mean population firing\nrate (solid line, bottom panel) also converges towards the optimal firing rate (dashed line, bottom\npanel). (C, E) Before learning, a raster plot of population spiking shows that neurons produce bursts\nof spikes (upper panel). The network output \u02c6x (red line, middle panel) fails to represent x (blue\nline, middle panel). The excitatory input (red, bottom left panel) and inhibitory input (green, bottom\nleft panel) to a randomly selected neuron is not tightly balanced. Furthermore, a histogram of inter-spike\nintervals shows that spiking activity is not Poisson, as indicated by the red line that represents\na best-fit exponential distribution. (D, F) At the end of learning, spiking activity is irregular and\nPoisson-like, excitatory and inhibitory input is tightly balanced and \u02c6x matches x.\n\nHow can we learn this optimal connection matrix? As before, we can derive a learning rule by\nminimising the cost function LV = \u2016V\u2016\u00b2/2. This leads to a Hebbian learning rule with the same\nform as before:\n\n\u03c4 \u02d9\u2126ij = Vi\u00afoj.    (22)\n\nAgain, we assume that the neural resets are given by \u2212Ti. Furthermore, in order for this learning rule\nto work, we must assume that the network input explores all possible directions in the J-dimensional\ninput space (since the kernels \u0393i can point in any of these directions). 
The learning performance\ndoes not critically depend on how the input variable space is sampled as long as the exploration\nis extensive. In our simulations, we randomly sample the input c from a Gaussian white noise\ndistribution at every time step for the entire duration of the learning.\nWe find that this learning rule decreases the loss function L, thereby approaching optimal network\nconnectivity and producing optimal firing rates for our linear decoder (Fig. 2B). In this example, we\nhave chosen connectivity that is initially much too weak at the start of learning. Consequently, the\ninitial network behaviour is similar to a collection of unconnected single neurons that ignore each\nother. Spike trains are not Poisson-like, firing rates are excessively large, excitatory and inhibitory\ninput is unbalanced and the decoded variable \u02c6x is highly unreliable (Fig. 2C, E). As a result of\nlearning, the network becomes tightly balanced and the spike trains become asynchronous, irregular\nand Poisson-like with much lower rates (Fig. 2D, F). However, despite this apparent variability, the\npopulation representation is extremely precise, only limited by the metabolic cost and the discrete\nnature of a spike. This learnt representation is far more precise than a rate code with independent\nPoisson spike trains [11]. In particular, shuffling the spike trains in response to identical inputs\ndrastically degrades this precision.\n\n4 Conclusions and Discussion\n\nIn population coding, large trial-to-trial spike train variability is usually interpreted as noise [2]. We\nshow here that a deterministic network of leaky integrate-and-fire neurons with a simple Hebbian\nplasticity rule can self-organise into a regime where information is represented far more precisely\nthan in noisy rate codes, while appearing to have noisy Poisson-like spiking dynamics.\nOur learning rule (Eqn. (22)) has the basic properties of STDP. 
Speci\ufb01cally, a presynaptic spike\noccurring immediately before a post-synaptic spike will potentiate a synapse, because membrane\npotentials are positive immediately before a postsynaptic spike. Furthermore, a presynaptic spike\noccurring immediately after a post-synaptic spike will depress a synapse, because membrane po-\ntentials are always negative immediately after a postsynaptic spike. This is similar in spirit to the\nSTDP rule proposed in [12], but different to classical STDP, which depends on post-synaptic spike\ntimes [9].\nThis learning rule can also be understood as a mechanism for generating a tight balance between\nexcitatory and inhibitory input. We can see this by observing that membrane potentials after learning\ncan be interpreted as representation errors (projected onto the read-out kernels). Therefore, learning\nacts to minimise the magnitude of membrane potentials. Excitatory and inhibitory input must be\nbalanced if membrane potentials are small, so we can equate balance with optimal information\nrepresentation.\nPrevious work has shown that the balanced regime produces (quasi-)chaotic network dynamics,\nthereby accounting for much observed cortical spike train variability [13, 14, 4]. Moreover, the\nSTDP rule has been known to produce a balanced regime [16, 17]. Additionally, recent theoretical\nstudies have suggested that the balanced regime plays an integral role in network computation [15,\n13]. 
In this work, we have connected these mechanisms and functions to conclude that learning this\nbalance is equivalent to the development of an optimal spike-based population code, and that this\nlearning can be achieved using a simple Hebbian learning rule.\n\nAcknowledgements\n\nWe are grateful for generous funding from the Emmy-Noether grant of the Deutsche Forschungsgemeinschaft\n(CKM) and the Chaire d\u2019excellence of the Agence Nationale de la Recherche (CKM,\nDB), as well as a James McDonnell Foundation Award (SD) and EU grants BACS FP6-IST-027140,\nBIND MECT-CT-20095-024831, and ERC FP7-PREDSPIKE (SD).\n\nReferences\n[1] Tolhurst D, Movshon J, Dean A (1983) The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Res 23: 775\u2013785.\n[2] Shadlen MN, Newsome WT (1998) The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J Neurosci 18(10): 3870\u20133896.\n[3] Zohary E, Shadlen MN, Newsome WT (1994) Correlated neuronal discharge rate and its implications for psychophysical performance. Nature 370: 140\u2013143.\n[4] Renart A, de la Rocha J, Bartho P, Hollender L, Parga N, Reyes A, Harris KD (2010) The asynchronous state in cortical circuits. Science 327: 587\u2013590.\n[5] Ecker AS, Berens P, Keliris GA, Bethge M, Logothetis NK, Tolias AS (2010) Decorrelated neuronal firing in cortical microcircuits. Science 327: 584\u2013587.\n[6] Okun M, Lampl I (2008) Instantaneous correlation of excitation and inhibition during ongoing and sensory-evoked activities. Nat Neurosci 11: 535\u2013537.\n[7] Shu Y, Hasenstaub A, McCormick DA (2003) Turning on and off recurrent balanced cortical activity. Nature 423: 288\u2013293.\n[8] Gentet LJ, Avermann M, Matyas F, Staiger JF, Petersen CCH (2010) Membrane potential dynamics of GABAergic neurons in the barrel cortex of behaving mice. 
Neuron 65: 422\u2013435.\n[9] Caporale N, Dan Y (2008) Spike-timing-dependent plasticity: a Hebbian learning rule. Annu Rev Neurosci 31: 25\u201346.\n[10] Boerlin M, Deneve S (2011) Spike-based population coding and working memory. PLoS Comput Biol 7: e1001080.\n[11] Boerlin M, Machens CK, Deneve S (2012) Predictive coding of dynamic variables in balanced spiking networks. under review.\n[12] Clopath C, B\u00fcsing L, Vasilaki E, Gerstner W (2010) Connectivity reflects coding: a model of voltage-based STDP with homeostasis. Nat Neurosci 13(3): 344\u2013352.\n[13] van Vreeswijk C, Sompolinsky H (1998) Chaotic balanced state in a model of cortical circuits. Neural Comput 10(6): 1321\u20131371.\n[14] Brunel N (2000) Dynamics of sparsely connected networks of excitatory and inhibitory neurons. J Comput Neurosci 8: 183\u2013208.\n[15] Vogels TP, Rajan K, Abbott LF (2005) Neural network dynamics. Annu Rev Neurosci 28: 357\u2013376.\n[16] Vogels TP, Sprekeler H, Zenke F, Clopath C, Gerstner W (2011) Inhibitory plasticity balances excitation and inhibition in sensory pathways and memory networks. Science 334(6062): 1569\u20131573.\n[17] Song S, Miller KD, Abbott LF (2000) Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat Neurosci 3(9): 919\u2013926.\n", "award": [], "sourceid": 1121, "authors": [{"given_name": "Ralph", "family_name": "Bourdoukan", "institution": null}, {"given_name": "David", "family_name": "Barrett", "institution": null}, {"given_name": "Sophie", "family_name": "Deneve", "institution": null}, {"given_name": "Christian", "family_name": "Machens", "institution": null}]}