{"title": "Variational Learning for Recurrent Spiking Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 136, "page_last": 144, "abstract": "We derive a plausible learning rule updating the synaptic efficacies for feedforward, feedback and lateral connections between observed and latent neurons. Operating in the context of a generative model for distributions of spike sequences, the learning mechanism is derived from variational inference principles. The synaptic plasticity rules found are interesting in that they are strongly reminiscent of experimentally found results on Spike Time Dependent Plasticity, and in that they differ for excitatory and inhibitory neurons. A simulation confirms the method's applicability to learning both stationary and temporal spike patterns.", "full_text": "Variational Learning for Recurrent Spiking Networks\n\nDanilo Jimenez Rezende\n\nBrain Mind Institute\n\n\u00b4Ecole Polytechnique F\u00b4ed\u00b4erale de Lausanne\n\n1015 Lausanne EPFL, Switzerland\ndanilo.rezende@epfl.ch\n\nSchool of Computer and Communication Sciences, Brain Mind Institute\n\nDaan Wierstra\n\n\u00b4Ecole Polytechnique F\u00b4ed\u00b4erale de Lausanne\n\n1015 Lausanne EPFL, Switzerland\n\ndaan.wierstra@epfl.ch\n\nSchool of Computer and Communication Sciences, Brain Mind Institute\n\nWulfram Gerstner\n\n\u00b4Ecole Polytechnique F\u00b4ed\u00b4erale de Lausanne\n\n1015 Lausanne EPFL, Switzerland\nwulfram.gerstner@epfl.ch\n\nAbstract\n\nWe derive a plausible learning rule for feedforward, feedback and lateral connec-\ntions in a recurrent network of spiking neurons. Operating in the context of a\ngenerative model for distributions of spike sequences, the learning mechanism is\nderived from variational inference principles. 
The synaptic plasticity rules found are interesting in that they are strongly reminiscent of experimental Spike Time Dependent Plasticity, and in that they differ for excitatory and inhibitory neurons. A simulation confirms the method's applicability to learning both stationary and temporal spike patterns.\n\n1 Introduction\n\nThis study considers whether recurrent networks of spiking neurons can be seen as a generative model not only of stationary patterns but also of temporal sequences. More precisely, we derive a model that learns to adapt its spontaneous spike sequences to conform as closely as possible to the empirical distribution of actual spike sequences caused by inputs impinging upon the sensory layer of the network.\nA generative model is a model of the joint distribution of percepts and hidden causes in the world. Since the world has complex temporal relationships, we need a model that is able to both recognize and predict temporal patterns. Behavioural studies (e.g., [1]) support the assumption that the brain is performing approximate Bayesian inference. More recently, evidence for this hypothesis was found in electro-physiological work as well [2]. Various abstract Bayesian models have been proposed to account for this phenomenon [3, 4, 5, 6, 7]. However, it remains an open question whether optimization in abstract Bayesian models can be translated into plausible learning rules for synapses in networks of spiking neurons.\nIn this paper, we show that the derivation of spike-based plasticity rules from statistical learning principles yields learning dynamics for a generative spiking network model which are akin to those of Spike-Time Dependent Plasticity (STDP) [8].\n\nFigure 1: A network of spiking neurons, divided into observed and latent pools of neurons.\n\nOur learning rule is derived from a variational optimization process. 
Typically, optimization in recurrent Bayesian networks involves both forward and backward propagation steps. We propose a plasticity rule that approximates backward steps by the introduction of delayed updates in the synaptic weights and dynamics. The theory is supported by simulations in which we demonstrate that the learning mechanism is able to capture the hidden causes behind the observed spiking patterns.\nWe use the Spike Response Model (SRM) [9, 10], in which spikes are generated stochastically depending on the neuronal membrane potential. The SRM is an example of a generalized linear model (GLM). It is closely related to the integrate-and-fire model, and has been successfully used to explain neuronal spike trains [11, 12]. In this model, the membrane potential of a neuron i at time t, expressed as $u_i(t)$, is given by\n\n$\\tau \\dot{u}_i(t) = -u_i(t) + b_i + \\sum_j W_{i,j} X_j(t)$,   (1)\n\nwhere $b_i$ is a bias which represents a constant external input to the neuron, and $X_j(t)$ is the spike train of the jth neuron defined by $X_j(t) = \\sum_{t_j^f} \\delta(t - t_j^f)$, where $\\{t_j^1, \\ldots, t_j^N\\}$ is the set of spike timings. The diagonal elements of the synaptic matrix are kept fixed to a negative value $W_{i,i} = -\\eta_0$ with $\\eta_0 = 1.0$, which implements a reset of the membrane potential after each spike and is a simple way to take into account neuronal refractoriness [9, 13]. The time constant is taken to be $\\tau = 10$ ms as in [13]. The spike generation process is stochastic with time-dependent firing intensity $\\rho_i(t)$ which depends on the membrane potential $u_i(t)$:\n\n$\\rho_i(t) = \\rho_0 \\exp(u_i(t))$.   (2)\n\nAn exponential dependence of the firing intensity upon the membrane potential agrees with experimental results [12]. 
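For concreteness, the dynamics of equations (1)-(2) can be integrated with a simple Euler scheme. The sketch below is a toy illustration, not the authors' code: the network size, random weights, bias values and $\rho_0$ are assumptions; only $\tau = 10$ ms, the fixed negative diagonal and a 1 ms step follow the text.

```python
import numpy as np

# Toy Euler integration of the stochastic Spike Response Model,
# eqs. (1)-(2): tau * du_i/dt = -u_i + b_i + sum_j W_ij X_j(t),
# rho_i(t) = rho0 * exp(u_i(t)).
# Network size, weights, biases and rho0 are illustrative assumptions.
rng = np.random.default_rng(0)
n, tau, dt, rho0, steps = 3, 10.0, 1.0, 0.01, 1000  # times in ms
W = 0.5 * rng.standard_normal((n, n))
np.fill_diagonal(W, -1.0)   # fixed negative diagonal: reset after a spike
b = np.zeros(n)
u = np.zeros(n)
spikes = np.zeros((steps, n))
for t in range(steps):
    rho = rho0 * np.exp(u)                        # firing intensity, eq. (2)
    X = (rng.random(n) < rho * dt).astype(float)  # stochastic spike draw
    spikes[t] = X
    # Euler step: leak plus bias, then the delta-spike input W X / tau
    u += (dt / tau) * (-u + b) + (W @ X) / tau
```

Each spike contributes a jump of $W_{i,j}/\tau$ to the postsynaptic potential, which is the discrete-time counterpart of the delta functions in $X_j(t)$.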
The set of equations (2) and (1) captures the simplified dynamics of a spiking neuron with stochastic spike timing.\nIn the following sections, we will introduce the theoretical framework and the approximations used in this paper. The basic learning mechanism is introduced and derived, followed by a simulation illustrating that our proposed learning rule is able to learn spatio-temporal features in the input spike trains and reproduce them in its spontaneous activity.\n\n2 Principled Framework\n\nWe consider a network consisting of two distinct sets of neurons, observed neurons (also called visible neurons or V) and latent neurons (also called hidden or H), as illustrated in Figure 1. The activities of the observed neurons represent the quantity of interest to be modelled, while the latent neurons fulfill a mediating role representing the hidden causes of the observed spike train.\nLearning in the context of this neuronal network consists of changing the synaptic strengths between neurons. We postulate that the underlying principle behind learning relies on learning distributions of spike trains evoked by either sensory inputs or more complicated sequences of cognitive events. In statistics, learning distributions involves minimizing a measure of distance between the model (that is, our neuronal network) and a target distribution (e.g. observations). A principled measure of distance between two distributions p and $p_{\\text{empirical}}$ is the Kullback-Leibler divergence [14] defined as\n\n$KL(p_{\\text{empirical}} \\| p) = \\int \\mathcal{D}X\\, p_{\\text{empirical}}(X) \\log \\frac{p_{\\text{empirical}}(X)}{p(X)}$,   (3)\n\nwhere individual X represent entire spike trains. 
$\\mathcal{D}X$ is a measure of integration over spike trains. Our learning mechanism tries to minimize the KL divergence between the distribution defined by our network p(X) and the observed spike timings distribution $p_{\\text{empirical}}$ that is evoked by an unknown external process. Note that minimizing the KL divergence entails maximizing the likelihood that the observed spike trains $X_V$ could have been generated by the model.\nIn order to derive the learning dynamics of our model in the next section, we need to evaluate the gradient of the likelihood (3) with respect to the free parameters of our model, i.e. the synaptic efficacies $W_{i,j}$ and biases $b_i$.\nThe joint likelihood of a particular spike train of both the observed $X_V$ and the latent neurons $X_H$ under our neuronal model can be written as [13]\n\n$\\log p(X_V, X_H) = \\sum_{i \\in V \\cup H} \\int_0^T d\\tau\\, [\\log \\rho_i(\\tau) X_i(\\tau) - \\rho_i(\\tau)]$.   (4)\n\nSince we have a neuronal network including latent units (that is, neurons not receiving external inputs), the actual observation likelihood is an effective quantity obtained by integrating over all possible latent spike trains $X_H$,\n\n$p(X_V) = \\int \\mathcal{D}X_H\\, p(X_V, X_H)$.   (5)\n\nThe gradient of (5) is given by an expectation conditioned on the observed neurons' history:\n\n$\\nabla \\log p(X_V) = \\nabla \\log \\int \\mathcal{D}X_H\\, p(X) = \\langle \\nabla \\log p(X) \\rangle_{p(X_H|X_V)}$,\n\nwhere $\\langle f(X) \\rangle_p = \\int \\mathcal{D}X f(X) p(X)$. This is difficult to evaluate since it conditions an entire latent spike train on an entire observed spike train. 
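In discrete time, the joint log-likelihood of equation (4) can be evaluated directly. The sketch below is a minimal illustration under assumed rates and a 1 ms step; the spike array, intensities and helper name are not from the paper.

```python
import numpy as np

# Discrete-time evaluation of the spike-train log-likelihood, eq. (4):
# log p(X) = sum_i integral [ log(rho_i(t)) X_i(t) - rho_i(t) ] dt.
# The spike array and intensities below are illustrative assumptions.
def spike_train_loglik(spikes, rho, dt=1.0):
    """spikes: (T, n) 0/1 array; rho: (T, n) intensity in 1/ms."""
    # Spikes are delta functions, so their term carries no dt factor;
    # the -rho term is an ordinary time integral.
    return float(np.sum(spikes * np.log(rho)) - np.sum(rho) * dt)

rng = np.random.default_rng(1)
rho = np.full((200, 2), 0.02)   # constant 20 Hz intensity (assumed)
spikes = (rng.random((200, 2)) < rho).astype(float)
ll = spike_train_loglik(spikes, rho)
```

Maximizing this quantity over the free parameters is exactly the likelihood maximization that the KL minimization in (3) entails.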
In other words, the posterior distribution of spike-timings of the latent neurons depends on both past and future of the observed neurons' spike train.\n\n2.1 Weak Coupling Approximation\n\nIn order to render the model more tractable, we introduce an approximation on the dynamics based on the weak coupling approximation [15], which amounts to replacing (1) by\n\n$\\tau \\dot{u}_i(t) = -u_i(t) + b_i + \\sum_j W_{i,j} \\rho_j(t) + z_i(t)$,   (6)\n\nwhere $z_i(t)$ is a Gaussian process with mean zero and inverse variance $\\gamma_i(t)$ given by\n\n$\\gamma_i^{-1}(t) = \\gamma_0 + \\frac{1}{\\tau^2} \\sum_j W_{i,j}^2 \\rho_j(t)$,   (7)\n\nwhere $\\gamma_0$ is intrinsic noise which we have added to regularize the simulations (we assume $\\gamma_0 = 0.1$).[1] Note that $\\gamma_i(t)$ is a function of both the network state and synaptic efficacies. Our network model defines a joint distribution between observed input spike trains and membrane potentials given by\n\n$\\log p(X_V, u) = \\sum_{i \\in V} \\int dt\\, [X_i(t) u_i(t) - \\rho_0 \\exp(u_i(t))] - \\sum_{i \\in V \\cup H} \\int dt\\, \\frac{\\gamma_i(t)}{2} (\\dot{u}_i(t) - f_i(t))^2$,   (8)\n\nwhere terms not depending on the model parameters and latent states have been dropped, as they do not contribute to the gradients we are interested in, and $f_i(t)$ is the drift of the Gaussian process of the membrane potentials, which can be read from equation (6):\n\n$f_i(t) = \\frac{1}{\\tau} \\Big( -u_i(t) + b_i + \\sum_j W_{i,j} \\rho_j(t) \\Big)$.   (9)\n\n[1] The variance of $\\dot{u}$ due to the external input can be obtained by noting that $u_i(t+dt) = u_i(t) \\exp(-dt/\\tau) + \\int_t^{t+dt} ds\\, \\exp((s-t-dt)/\\tau)\\, (b_i + \\sum_j W_{i,j} X_j(s))/\\tau$. Thus, in the weak coupling regime $\\mathrm{Var}(u(t+dt)|u(t)) = \\sum_j W_{i,j}^2 \\int_t^{t+dt} ds\\, \\exp(2(s-t-dt)/\\tau)\\, \\rho_j(t)/\\tau^2 = \\frac{dt}{\\tau^2} \\sum_j W_{i,j}^2 \\rho_j(t)$.\n\nThe weak coupling approximation amounts to replacing spikes of the latent neurons by intensities plus Gaussian noise. 
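The state-dependent noise level of equation (7) is straightforward to compute. The sketch below is illustrative: the weight matrix and rates are assumptions, while $\gamma_0 = 0.1$ and $\tau = 10$ ms follow the text.

```python
import numpy as np

# Inverse variance of the weak-coupling noise, eq. (7):
#   1/gamma_i(t) = gamma0 + (1/tau^2) * sum_j W_ij^2 * rho_j(t).
# Weights and rates are illustrative assumptions; gamma0, tau follow the text.
def gamma(W, rho, tau=10.0, gamma0=0.1):
    """Return gamma_i(t) for the current presynaptic rates rho (in 1/ms)."""
    return 1.0 / (gamma0 + (W ** 2) @ rho / tau ** 2)

W = np.array([[-1.0, 0.3],
              [0.2, -1.0]])
rho = np.array([0.02, 0.05])   # instantaneous firing intensities (assumed)
g = gamma(W, rho)
# Stronger presynaptic activity lowers gamma, i.e. raises the noise variance.
```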
Note that in this approximated model, the latent variables are no longer the latent spike trains, but the membrane potentials. However, we emphasize that in the end the intensities can be substituted by spikes, as we will see below.\n\n2.2 Variational Approximation of the Posterior Membrane Potential p(u|XV)\n\nThe variational approach in statistics is a method to approximate some complex distribution p by a family of simpler distributions q. Variational methods have been applied to spiking neural networks in many different contexts, such as in connectivity or external source inference [20, 21]. In the following, we try to interpret the neural activity and plasticity together as an approximate form of variational learning.\nWe approximate the posterior p(u|XV) by the Gaussian process\n\n$\\log q(u) = -\\sum_i \\int dt\\, \\frac{\\gamma_i(t)}{2} (\\dot{u}_i(t) - h_i(t))^2 + c$,   (10)\n\nwhere the $h_i(t)$ are variational parameters representing the drift of the ith membrane potential at time t in the posterior process and c is a normalization constant. Note that the parameters $\\gamma_i(t)$ of the posterior process are taken to be the same as the network dynamics noise in (6). This is necessary in order to have a finite KL-divergence between the prior and the posterior processes [22].\nFinding a good approximation for the variational parameters $h_i(t)$ amounts to minimizing the quantity $KL(q(u) \\| p(X_V, u))$, which is given by\n\n$KL(q \\| p) = -\\int dt \\Big\\langle \\sum_{i \\in V} [X_i(t) u_i(t) - \\rho_0 \\exp(u_i(t))] - \\sum_{i \\in V \\cup H} \\frac{\\gamma_i(t)}{2} (\\dot{u}_i(t) - f_i(t))^2 + \\sum_{i \\in V \\cup H} \\frac{\\gamma_i(t)}{2} (\\dot{u}_i(t) - h_i(t))^2 \\Big\\rangle_{q(u)}$.   (11)\n\nAlthough (11) can be written analytically in terms of the instantaneous mean and covariance of the posterior process, we adopt a simpler mean-field approximation, i.e. $\\langle F(u_i(t)) \\rangle \\approx F(\\langle u_i(t) \\rangle)$. We write the mean $\\langle u_i(t) \\rangle = \\bar{u}_i(t)$ as\n\n$\\bar{u}_i(t) = \\bar{u}_i(0) + \\int_0^t ds\\, h_i(s)$,   (12)\n\nwhere the $h_i$ plays the role of the 'drift' or the derivative of $\\bar{u}_i$. 
Note that $\\delta \\bar{u}_i(t) / \\delta h_j(t_0) = \\Theta(t - t_0)\\, \\delta_{i,j}$, where $\\Theta(x)$ is the Heaviside step function. As a result, the KL-divergence becomes\n\n$KL(q \\| p) \\approx -\\int dt \\sum_i \\Big\\{ [X_i(t) \\bar{u}_i(t) - \\rho_0 \\exp(\\bar{u}_i(t))]\\, \\delta_{i \\in V} - \\frac{\\gamma_i(t)}{2} (h_i(t) - f_i(t))^2 \\Big\\}$.   (13)\n\nThe drifts $h_i(t)$ of the variational approximation can be updated using gradient descent, with the functional gradient\n\n$\\frac{\\delta KL}{\\delta h_k(t_0)} = -\\int dt\\, [X_k(t) - \\rho_0 \\exp(\\bar{u}_k(t))]\\, \\Theta(t - t_0)\\, \\delta_{k \\in V} + \\gamma_k(t_0)(h_k(t_0) - f_k(t_0)) - \\int dt \\sum_i \\gamma_i(t)(h_i(t) - f_i(t)) \\frac{\\delta f_i(t)}{\\delta h_k(t_0)} + \\frac{1}{2} \\sum_i \\int dt\\, \\frac{\\delta \\gamma_i(t)}{\\delta h_k(t_0)} (h_i(t) - f_i(t))^2$,   (14)\n\nwhere\n\n$\\frac{\\delta f_i(t)}{\\delta h_k(t_0)} = \\frac{1}{\\tau} (-\\delta_{i,k} + W_{i,k} \\rho_k(t))\\, \\Theta(t - t_0)$,   (15)\n\n$\\frac{\\delta \\gamma_i(t)}{\\delta h_k(t_0)} = -\\frac{1}{\\tau^2} \\gamma_i^2(t) W_{i,k}^2 \\rho_k(t)\\, \\Theta(t - t_0)$.   (16)\n\nFigure 2: Posterior firing intensity for two simple networks: (a) A network with 4 neurons, simulated with mean field approximation. (b) From top to bottom: the observed spike train, the firing intensity for the three latent neurons and the posterior inverse variance. The green neuron has a direct connection to the observed neuron, and as such has a much stronger modulation of its firing rate than the other two latent neurons. (c) A network with two pools of 20 neurons, the observed and the latent pools. (d) Simulation results. From top to bottom: observed spike trains, spike trains in the latent pool and mean firing intensities of the latent neurons over different realizations of the network. The rate of the latent pool increases just before the spikes of the observed neurons. Note that the spiking implementation of the model has the same rates as the mathematical rate model.\n\nThere are a few key points to note regarding (14). First, in the absence of observations, the best approximating $h_i(t)$ is simply given by $f_i(t)$, that is, the posterior and the prior processes become equal. 
Second, the first, third and fourth terms in (14) are backward terms, that is, they correspond to corrections in the 'belief' about the past states generated by new inputs. This implies that in order to estimate the drift $h_i(t)$ of the posterior membrane potential of neuron i at time t, we need to know the observations $X(t_0)$ at times $t_0 > t$. Third, the fourth term in equation (14) is a contribution to the gradient that comes from the fact that the inverse variance $\\gamma_i(t)$ defined in equation (7) is also a function of the network state. This is an important feature of the model, since it implies that the amount of noise in the dynamics is also being adapted to better explain the observed spike trains.\n\n2.3 Towards Spike-time Dependent Plasticity\n\nWe learn the parameters of our network, that is, the synaptic weights and the neural 'biases', by gradient descent with learning rate $\\eta$:\n\n$\\Delta b_i = -\\eta \\frac{\\delta KL}{\\delta b_i} = \\eta \\int dt\\, \\frac{\\gamma_i(t)}{\\tau} (h_i(t) - f_i(t))$,   (17)\n\n$\\Delta W_{k,l} = -\\eta \\frac{\\delta KL}{\\delta W_{k,l}} = \\eta \\int dt\\, \\frac{\\gamma_k(t)}{\\tau} (h_k(t) - f_k(t)) \\rho_l(t) + \\frac{\\eta}{2} \\sum_i \\int dt\\, \\frac{\\delta \\gamma_i(t)}{\\delta W_{k,l}} (h_i(t) - f_i(t))^2$,   (18)\n\nwhere $\\frac{\\delta \\gamma_i(t)}{\\delta W_{k,l}} = -\\frac{2}{\\tau^2} \\gamma_i^2(t) W_{i,l} \\rho_l(t)\\, \\delta_{k,i}$. Note that once the posterior drift $h_i(t)$ is known, the computation of $\\Delta b$ and $\\Delta W$ can be done purely locally.\nA long 'backward window' would, of course, be biologically implausible. However, on-line approximations to the backward terms provide a reasonable approximation by taking small backward filters of up to 50ms. Mechanistically, applications of $\\Delta W$ can operate with a small delay, which is required to calculate the backward correction term. 
In biology such delays indeed exist, as the weights are switched to a new value only some time after the stimulation that induces the change [23, 24].\nMore precisely, using a small backward window amounts to approximating the gradient of the posterior drift $h_i(t)$ by cutting off the time integrals using a finite time horizon, i.e., in equation (14) we replace the integral $\\int dt$ by $\\int_{t_0}^{t_0+T} dt$, where T is the size of the 'backward window' used to approximate the gradient. The expression (14) can now be written as a delayed update equation\n\n$\\Delta h_k(t-T) \\propto \\int_{t-T}^{t} ds\\, [X_k(s) - \\rho_0 \\exp(\\bar{u}_k(s))]\\, \\delta_{k \\in V} - \\gamma_k(t-T)(h_k(t-T) - f_k(t-T)) + \\int_{t-T}^{t} ds \\sum_i \\gamma_i(s)(h_i(s) - f_i(s)) \\frac{\\delta f_i(s)}{\\delta h_k(t-T)} - \\frac{1}{2} \\sum_i \\int_{t-T}^{t} ds\\, \\frac{\\delta \\gamma_i(s)}{\\delta h_k(t-T)} (h_i(s) - f_i(s))^2$.   (19)\n\nThe resulting update for the variable $h_k$ is used in the learning equation (18).\nThe simulation shown in Figure 2 provides a conceptual illustration of how the posterior firing intensity $\\rho_l(t)$ propagates information backward from observed into latent neurons, a process that is essential for learning temporal patterns. Note that $\\rho_l$ is the firing rate of the presynaptic neuron l and as such it is information that is not directly available at the site of the synapse, which has access only to spike arrivals (but not the underlying firing rate). However, spike arrivals do provide a reasonable estimate of the rate. 
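As a sanity check of this last point, an exponentially filtered spike train does recover the underlying rate. The sketch below is illustrative only: the 5 ms filter constant and the 50 Hz test rate are assumptions, not values from the paper.

```python
import numpy as np

# A synapse only sees spike arrivals, but a leaky (exponential) filter
# of the spike train yields a running estimate of the presynaptic rate.
# The 5 ms filter constant and 50 Hz test rate are illustrative assumptions.
def filtered_rate(spikes, dt=1.0, tau_f=5.0):
    """spikes: (T,) 0/1 array -> (T,) rate estimate in 1/ms."""
    est = np.empty(len(spikes))
    r = 0.0
    for t, x in enumerate(spikes):
        r += -(dt / tau_f) * r + x / tau_f   # unit-area exponential kernel
        est[t] = r
    return est

rng = np.random.default_rng(2)
true_rate = 0.05                       # 0.05 spikes/ms = 50 Hz
spikes = (rng.random(5000) < true_rate).astype(float)
est = filtered_rate(spikes)            # long-run mean approximates true_rate
```

Because the kernel has unit area, the filtered trace fluctuates around the true intensity, which is the sense in which spikes can stand in for rates below.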
Indeed, Figures 2c and d show that a simulation of a network of pools of spiking neurons where updates are only based on spike times (rather than rates) gives qualitatively the same information as the rate formula derived above. In equations (20, 15) we could therefore replace the pre-synaptic firing intensity $\\rho_j(t)$ by temporally filtered spike trains, which constitute a good approximation to $\\rho_j(t)$.\n\n2.4 STDP Window\n\nFrom our learning equation for the synaptic weight (18), we can extract an STDP-like learning window by rewriting the plasticity rules as $\\Delta W_{i,j} = \\int dt\\, \\Delta W_{i,j}(t)$, where\n\n$\\Delta W_{i,j}(t) = \\frac{\\gamma_i(t)}{\\tau} (h_i(t) - f_i(t)) \\rho_j(t) + \\frac{1}{2} \\sum_k \\frac{\\delta \\gamma_k(t)}{\\delta W_{i,j}} (h_k(t) - f_k(t))^2$.   (20)\n\n$\\Delta W_{i,j}(t)$ is the expected change in $W_{i,j}$ at time t under the posterior. As before, we replace the firing intensity $\\rho_j$ in a given trial by the spikes. Assuming a spike of the observed neuron at t = 0, we have evaluated h(t) and f(t) and plot the weight change $\\gamma_k(t_0)(h_k(t_0) - f_k(t_0))$ that would occur if the latent neuron fires at $t_0$, cf. equation (18). We show the resulting Spike-time Dependent Plasticity for a simple network of two neurons in Figure 3.\nNote that the shape of $\\Delta W_{i,j}(t)$ is remarkably reminiscent of the experimentally found measurements for STDP [8]. In particular, the shape of the STDP curve depends on the type of neuron and is different for connections from excitatory to excitatory than from excitatory to inhibitory or inhibitory to inhibitory neurons (Figure 3).\n\n3 Simulations\n\nIn order to demonstrate the method's ability to capture both stationary and temporal patterns, we performed simulations on two tasks. The first one involves the formation of a temporal chain, while the second one involves a stationary pattern generator. Both simulations were done using a discrete-time (Euler method) version of the equations (14, 17, 18 and 19) with dt = 1ms. 
The backward window size was taken to be T = 50ms, and a learning rate of 0.02 was used.\n\nFigure 3: Spike-time Dependent Plasticity in a simple network composed of two neurons. Weight change $\\Delta W_{i,j}(t)$ (vertical axis) as a function of spike timing of the neuron at the top (the latent neuron), given that the bottom (observed) neuron produces a spike at t = 0 (horizontal axis). Shown are all permutations of excitatory (e) and inhibitory (i) neuron types, with the left and right learning windows next to each network corresponding to the downward and upward synapses, respectively.\n\nThe first task consisted of learning a periodic chain, in which three pools of observed neurons were successively activated as shown in Figure 4a. A time lag was introduced between the third and the first pattern so as to force the network to form temporal hidden cause representations that are capable of capturing time dependencies without obvious observable instantaneous clues – during a blank moment, the only way a network can tell which pattern comes next is by actively using the latent neurons. After learning, the spontaneous patterns in the observable neurons developed a clear resemblance to the patterns provided during training, although a slightly larger amount of noise was present, as shown in Figure 4b. If the noise level of the model network is reduced, a noise-free 'cleared-up concept' of the observed patterns is generated (Figure 4d), which clearly demonstrates that the recurrent network has indeed learned the task.\nThe way learning has configured the network in the sequence task can be understood if we study the connectivity pattern of the latent neurons. The latent neurons are active during the whole sequence (Figure 4c). We have reordered the labels of the neurons so that the structure of the connectivity matrix becomes visible. 
There are subsets of latent neurons that are particularly active during each of the three 'subpatterns' in the sequence task, and other latent neurons that become active while the observable units are quiescent (Figure 4i). The lateral connectivity between the latent neurons has an asymmetry in the forward direction of the chain.\nThe second task aimed at learning to randomly generate one of three stationary patterns every 10ms. Successful learning of this task requires both the learning of the stationary patterns and the stochastic transitions between them. Figure 4e–h shows the results on this task.\n\n4 Discussion\n\nSome models have recently been proposed where STDP-like learning rules derive from 'first principles' (e.g., [25, 26, 13]). However, these models have either difficulty dealing with recurrent latent dynamics, or they do not account for non-factorial latent representations. In this work, we have proposed a plausible derivation for synaptic plasticity in a network consisting of spiking neurons, which can both capture time dependencies in observed spike trains and process combinatorial features. Using a generative model comprising both latent and observed neurons, the mechanism utilizes implicit (that is, short-term delayed) backward iterations that arise naturally from variational inference. A plasticity mechanism emerges that closely resembles that of the familiar STDP mechanism found in experimental studies. In our simulations we show that the plasticity rules are capable of learning both a temporal and a stationary pattern generator. 
Future work will attempt to further elucidate the possible biological plausibility of the approach, and its connection to Spike-Time Dependent Plasticity.\n\nAcknowledgments\n\nSupport was provided by the SNF grant (CRSIK0 122697), the ERC grant (268689) and the SystemsX IPhD grant.\n\nFigure 4: Simulation results. Sequence task a–d, i: a 20ms-periodic sequence with a network of 30 observed neurons and 15 latent neurons having 50% of inhibitory neurons (chosen randomly). The connections between the observed neurons have been set to zero in order to illustrate the use of latent-to-latent recurrent connections. (a) A sample of the periodic input pattern. Note the long waiting time after each sequence 1→2→3 (1→2→3→wait→1→2→3 . . . ). (b) Simulations from the network with the first 20ms clamped to the data. (c) Latent neurons sample. (d) Sample simulation of the network with the same parameters but with less noise, in order to better show the underlying dynamics. This is achieved by the transformation $\\rho_i(t) \\to \\rho_i(t)^\\beta$ with $\\beta = 2$. Random jump task e–h: learning to produce one of three patterns (4ms long) every 10ms. (e) A sample input pattern. (f) One realization from the network with the first 20ms clamped to the data. (g) Sample latent pattern. (h) Sample simulation of the network with the same parameters but with less noise. Note that decreasing the level of noise is actually an impairment in performance for this task. 
(i) The learned synaptic matrix for the first task; the latent neurons have been re-ordered in order to show the role of the latent-to-latent synapses in the dynamics as well as the role of the latent-to-observed synapses, which represent the pattern features.\n\nReferences\n[1] Konrad P Körding and Daniel M Wolpert. Bayesian integration in sensorimotor learning. Nature, 427(6971):244–7, January 2004.\n[2] P. Berkes, G. Orban, M. Lengyel, and J. Fiser. Spontaneous Cortical Activity Reveals Hallmarks of an Optimal Internal Model of the Environment. Science, 331(6013):83–87, January 2011.\n[3] Wei Ji Ma, Jeffrey M Beck, and Alexandre Pouget. Spiking networks for Bayesian inference and choice. Current Opinion in Neurobiology, 18(2):217–22, April 2008.\n[4] Joshua B Tenenbaum, Thomas L Griffiths, and Charles Kemp. Theory-based Bayesian models of inductive learning and reasoning. Trends in Cognitive Sciences, 10(7):309–18, 2006.\n[5] Konrad P Körding and Daniel M Wolpert. Bayesian decision theory in sensorimotor control. Trends in Cognitive Sciences, 10(7):319–26, July 2006.\n[6] D. Acuna and P. Schrater. Bayesian modeling of human sequential decision-making on the multi-armed bandit problem. In Proceedings of the 30th Annual Conference of the Cognitive Science Society. Washington, DC: Cognitive Science Society, 2008.\n[7] Michael D. Lee. A Hierarchical Bayesian Model of Human Decision-Making on an Optimal Stopping Problem. Cognitive Science, 30(3):1–26, May 2006.\n[8] G. Bi and M. Poo. Synaptic modification by correlated activity: Hebb's postulate revisited. Annual Review of Neuroscience, 24(1):139–166, 2001.\n[9] W. Gerstner and W. K. Kistler. Mathematical Formulations of Hebbian Learning. Biological Cybernetics, 87(5-6):404–415, 2002.\n[10] W. Gerstner. Spike-response model. 
Scholarpedia, 3(12):1343, 2008.\n[11] J W Pillow, J Shlens, L Paninski, A Sher, A M Litke, E J Chichilnisky, and E P Simoncelli. Spatio-temporal correlations and visual signaling in a complete neuronal population. Nature, 454(7206):995–999, August 2008.\n[12] Renaud Jolivet, Alexander Rauch, Hans R. Lüscher, and Wulfram Gerstner. Predicting spike timing of neocortical pyramidal neurons by simple threshold models. J Comput Neurosci, 21(1):35–49, August 2006.\n[13] J.P. Pfister, Taro Toyoizumi, D. Barber, and W. Gerstner. Optimal spike-timing-dependent plasticity for precise action potential firing in supervised learning. Neural Computation, 18(6):1318–1348, 2006.\n[14] S. Kullback and R. A. Leibler. On Information and Sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, March 1951.\n[15] Taro Toyoizumi, Kamiar Rahnama Rad, and Liam Paninski. Mean-field approximations for coupled populations of generalized linear model spiking neurons with Markov refractoriness. Neural Computation, 21(5):1203–43, May 2009.\n[16] Brendan J. Frey and Geoffrey E. Hinton. Variational Learning in Nonlinear Gaussian Belief Networks. Neural Computation, 11(1):193–213, January 1999.\n[17] Karl Friston, Jérémie Mattout, Nelson Trujillo-Barreto, John Ashburner, and Will Penny. Variational free energy and the Laplace approximation. NeuroImage, 34(1):220–34, January 2007.\n[18] Matthew J Beal and Zoubin Ghahramani. Variational Bayesian Learning of Directed Graphical Models with Hidden Variables. Bayesian Analysis, 1(4):793–832, 2006.\n[19] T.S. Jaakkola and M.I. Jordan. Bayesian parameter estimation via variational methods. Statistics and Computing, 10(1):25–37, 2000.\n[20] Jayant E Kulkarni and Liam Paninski. Common-input models for multiple neural spike-train data. 
Network (Bristol, England), 18(4):375–407, December 2007.\n[21] Ian H Stevenson, James M Rebesco, Nicholas G Hatsopoulos, Zach Haga, Lee E Miller, and Konrad P Körding. Bayesian inference of functional connectivity and network structure from spikes. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 17(3):203–13, June 2009.\n[22] C. Archambeau, Dan Cornford, Manfred Opper, and J. Shawe-Taylor. Gaussian process approximations of stochastic differential equations. In Journal of Machine Learning Research Workshop and Conference Proceedings, volume 1, pages 1–16, 2007.\n[23] Daniel H O'Connor, Gayle M Wittenberg, and Samuel S-H Wang. Graded bidirectional synaptic plasticity is composed of switch-like unitary events. Proceedings of the National Academy of Sciences of the United States of America, 102(27):9679–84, July 2005.\n[24] C C Petersen, R C Malenka, R A Nicoll, and J J Hopfield. All-or-none potentiation at CA3-CA1 synapses. Proceedings of the National Academy of Sciences of the United States of America, 95(8):4732–7, April 1998.\n[25] Rajesh P N Rao. Bayesian computation in recurrent neural circuits. Neural Computation, 16(1):1–38, January 2004.\n[26] Bernhard Nessler, Michael Pfeiffer, and Wolfgang Maass. STDP enables spiking neurons to detect hidden causes of their inputs. Advances in Neural Information Processing Systems (NIPS09), pages 1357–1365, 2009.\n", "award": [], "sourceid": 126, "authors": [{"given_name": "Danilo", "family_name": "Rezende", "institution": null}, {"given_name": "Daan", "family_name": "Wierstra", "institution": null}, {"given_name": "Wulfram", "family_name": "Gerstner", "institution": null}]}