{"title": "Reducing Spike Train Variability: A Computational Theory Of Spike-Timing Dependent Plasticity", "book": "Advances in Neural Information Processing Systems", "page_first": 201, "page_last": 208, "abstract": null, "full_text": " Reducing Spike Train Variability:\n A Computational Theory Of\n Spike-Timing Dependent Plasticity\n\n\n Sander M. Bohte1,2 Michael C. Mozer2\n S.M.Bohte@cwi.nl mozer@cs.colorado.edu\n 1 Dept. Software Engineering 2 Dept. of Computer Science\n CWI, Amsterdam, The Netherlands University of Colorado, Boulder, USA\n\n Abstract\n Experimental studies have observed synaptic potentiation when a\n presynaptic neuron fires shortly before a postsynaptic neuron, and\n synaptic depression when the presynaptic neuron fires shortly af-\n ter. The dependence of synaptic modulation on the precise tim-\n ing of the two action potentials is known as spike-timing depen-\n dent plasticity or STDP. We derive STDP from a simple compu-\n tational principle: synapses adapt so as to minimize the postsy-\n naptic neuron's variability to a given presynaptic input, causing\n the neuron's output to become more reliable in the face of noise.\n Using an entropy-minimization objective function and the biophys-\n ically realistic spike-response model of Gerstner (2001), we simu-\n late neurophysiological experiments and obtain the characteristic\n STDP curve along with other phenomena including the reduction in\n synaptic plasticity as synaptic efficacy increases. We compare our\n account to other efforts to derive STDP from computational princi-\n ples, and argue that our account provides the most comprehensive\n coverage of the phenomena. Thus, reliability of neural response in\n the face of noise may be a key goal of cortical adaptation.\n\n1 Introduction\nExperimental studies have observed synaptic potentiation when a presynaptic neu-\nron fires shortly before a postsynaptic neuron, and synaptic depression when the\npresynaptic neuron fires shortly after. 
The dependence of synaptic modulation on the precise timing of the two action potentials, known as spike-timing dependent plasticity or STDP, is depicted in Figure 1. Typically, plasticity is observed only when the presynaptic and postsynaptic spikes (hereafter, pre and post) occur within a 20-30 ms time window, and the transition from potentiation to depression is very rapid. Another important observation is that synaptic plasticity decreases with increased synaptic efficacy. The effects are long lasting, and are therefore referred to as long-term potentiation (LTP) and depression (LTD). For detailed reviews of the evidence for STDP, see [1, 2].

Figure 1: (a) Measuring STDP experimentally: pre-post spike pairs are repeatedly induced at a fixed interval Δt_pre-post, and the resulting change to the strength of the synapse is assessed; (b) change in synaptic strength after repeated spike pairing as a function of the difference in time between the pre and post spikes (data from Zhang et al., 1998). We have superimposed an exponential fit of LTP and LTD.

Because these intriguing findings appear to describe a fundamental learning mechanism in the brain, a flurry of models has been developed focusing on different aspects of STDP, from biochemical models that explain the underlying mechanisms giving rise to STDP [3], to models that explore the consequences of STDP-like learning rules in an ensemble of spiking neurons [4, 5, 6, 7], to models that propose fundamental computational justifications for STDP. Most commonly, STDP is viewed as a type of asymmetric Hebbian learning with a temporal dimension. However, this perspective is hardly a fundamental computational rationale, and one would hope that such an intuitively sensible learning rule would emerge from a first-principles computational justification.
Several researchers have tried to derive a learning rule yielding STDP from first principles. 
Rao and Sejnowski [8] show that STDP emerges when a neuron attempts to predict its membrane potential at some time t from the potential at time t − Δt. However, STDP emerges only for a narrow range of Δt values, and the qualitative nature of the modeling makes it unclear whether a quantitative fit can be obtained. Dayan and Häusser [9] show that STDP can be viewed as an optimal noise-removal filter for certain noise distributions. However, even small variations from these noise distributions yield quite different learning rules, and the noise statistics of biological neurons are unknown. Eisele (private communication) has shown that an STDP-like learning rule can be derived from the goal of maintaining the relevant connections in a network. Chechik [10] is most closely related to the present work. He relates STDP to information theory via maximization of mutual information between input and output spike trains. This approach derives the LTP portion of STDP, but fails to yield the LTD portion.
The computational approach of Chechik (as well as Dayan and Häusser) is premised on a rate-coding neuron model that disregards the relative timing of spikes. It seems quite odd to argue for STDP using rate codes: if spike timing is irrelevant to information transmission, then STDP is likely an artifact and is not central to understanding mechanisms of neural computation. Further, as noted in [9], because STDP is not quite additive in the case of multiple input or output spikes that are near in time [11], one should consider interpretations that are based on individual spikes, not aggregates over spike trains.
Here, we present an alternative computational motivation for STDP. We conjecture that a fundamental objective of cortical computation is to achieve reliable neural responses, that is, neurons should produce the identical response--both in the number and timing of spikes--given a fixed input spike train. 
Reliability is an issue if neurons are affected by noise, because noise leads to variability in a neuron's dynamics and therefore in its response. Minimizing this variability reduces the effect of noise and thereby increases the informativeness of the neuron's output signal. The source of the noise is not important; it could be intrinsic to a neuron (e.g., a noisy threshold) or it could originate in unmodeled external sources causing fluctuations in the membrane potential uncorrelated with a particular input.
We are not suggesting that increasing neural reliability is the only learning objective. If it were, a neuron would do well to give no response regardless of the input. Rather, reliability is but one of many objectives that learning tries to achieve. This form of unsupervised learning must, of course, be complemented by supervised and reinforcement learning that allow an organism to achieve its goals and satisfy drives.
We derive STDP from the following computational principle: synapses adapt so as to minimize the entropy of the postsynaptic neuron's output in response to a given presynaptic input. In our simulations, we follow the methodology of neurophysiological experiments. This approach leads to a detailed fit to key experimental results. We model not only the shape (sign and time course) of the STDP curve, but also the fact that potentiation of a synapse depends on the efficacy of the synapse--it decreases with increased efficacy. In addition to fitting these key STDP phenomena, the model allows us to make predictions regarding the relationship between properties of the neuron and the shape of the STDP curve.
Before delving into the details of our approach, we give the basic intuition behind it. Noise in spiking neuron dynamics leads to variability in the number and timing of spikes. 
Given a particular input, one spike train might\nbe more likely than others, but the output is nondeterministic. By the entropy-\nminimization principle, adaptation should reduce the likelihood of these other pos-\nsibilities. To be concrete, consider a particular experimental paradigm. In [12], a\npre neuron is identified with a weak synapse to a post neuron, such that the pre is\nunlikely to cause the post to fire. However, the post can be induced to fire via a\nsecond presynaptic connection. In a typical trial, the pre is induced to fire a single\nspike, and with a variable delay, the post is also induced to fire (typically) a single\nspike. To increase the likelihood of the observed post response, other response pos-\nsibilities must be suppressed. With presynaptic input preceding the postsynaptic\nspike, the most likely alternative response is no output spikes at all. Increasing\nthe synaptic connection weight should then reduce the possibility of this alternative\nresponse. With presynaptic input following the postsynaptic spike, the most likely\nalternative response is a second output spike. Decreasing the synaptic connection\nweight should reduce the possibility of this alternative response. Because both of\nthese alternatives become less likely as the lag between pre and post spikes is in-\ncreased, one would expect that the magnitude of synaptic plasticity diminishes with\nthe lag, as is observed in the STDP curve.\nOur approach to reducing response variability given a particular input pattern in-\nvolves computing the gradient of synaptic weights with respect to a differentiable\nmodel of spiking neuron behavior. We use the Spike Response Model (SRM) of [13]\nwith a stochastic threshold, where the stochastic threshold models fluctuations of\nthe membrane potential or the threshold outside of experimental control. 
For the\nstochastic SRM, the response probability is differentiable with respect to the synap-\ntic weights, allowing us to calculate the entropy gradient with respect to the weights\nconditional on the presented input. Learning is presumed to take a gradient step\nto reduce this conditional entropy. In modeling neurophysiological experiments, we\ndemonstrate that this learning rule yields the typical STDP curve. We can predict\nthe relationship between the exact shape of the STDP curve and physiologically\nmeasurable parameters, and we show that our results are robust to the choice of\nthe few free parameters of the model.\nTwo papers in these proceedings are closely related to our work. They also find\nSTDP-like curves when attempting to maximize an information-theoretic measure--\nthe mutual information between input and output--for a Spike Response Model\n[14, 15]. Bell & Parra [14] use a deterministic SRM model which does not model the\nLTD component of STDP properly. The derivation by Toyoizumi et al. [15] is valid\nonly for an essentially constant membrane potential with small fluctuations. Neither\nof these approaches has succeeded in quantitatively modeling specific experimental\n\n\f\ndata with neurobiologically-realistic timing parameters, and neither explains the\nsaturation of LTD/LTP with increasing weights as we do. Nonetheless, these models\nmake an interesting contrast to ours by suggesting a computational principle of\noptimization of information transmission, as contrasted with our principle of neural\nnoise reduction. Perhaps experimental tests can be devised to distinguish between\nthese competing theories.\n\n2 The Stochastic Spike Response Model\n\nThe Spike Response Model (SRM), defined by Gerstner [13], is a generic integrate-\nand-fire model of a spiking neuron that closely corresponds to the behavior of a\nbiological spiking neuron and is characterized in terms of a small set of easily inter-\npretable parameters [16]. 
The standard SRM formulation describes the temporal evolution of the membrane potential based on past neuronal events, specifically as a weighted sum of postsynaptic potentials (PSPs) modulated by reset and threshold effects of previous postsynaptic spiking events. Following [13], the membrane potential of cell i at time t, u_i(t), is defined as:

    u_i(t) = η(t − f̂_i) + Σ_{j∈Γ_i} Σ_{f_j∈F_j^t} w_ij ε(t − f̂_i, t − f_j),    (1)

where Γ_i is the set of inputs connected to neuron i, F_j^t is the set of times prior to t at which neuron j has spiked, f̂_i is the time of the last spike of neuron i, w_ij is the synaptic weight from neuron j to neuron i, ε(t − f̂_i, t − f_j) is the PSP in neuron i due to an input spike from neuron j at time f_j, and η(t − f̂_i) is the refractory response due to the postsynaptic spike at time f̂_i. Neuron i fires when the potential u_i(t) exceeds a threshold θ from below.
The postsynaptic potential is modeled as the differential alpha function in [13], defined with respect to two variables: the time since the most recent postsynaptic spike, x, and the time since the presynaptic spike, s:

    ε(x, s) = 1/(1 − τ_s/τ_m) [ (exp(−s/τ_m) − exp(−s/τ_s)) H(s) H(x − s)
              + exp(−(s − x)/τ_s) (exp(−x/τ_m) − exp(−x/τ_s)) H(x) H(s − x) ],    (2)

where τ_s and τ_m are the rise and decay time-constants of the PSP, and H is the Heaviside step function. The refractory reset function is defined as [13]:

    η(x) = u_abs H(Δ_abs − x) H(x) + u_abs exp(−(x − Δ_abs)/τ_r^f) H(x − Δ_abs) + u_r^s exp(−x/τ_r^s),    (3)

where u_abs is a large negative contribution to the potential that models the absolute refractory period of duration Δ_abs. We smooth this refractory response by a fast decaying exponential with time constant τ_r^f. The third term in the sum represents the slow decaying exponential recovery of an elevated threshold, with magnitude u_r^s and time constant τ_r^s. (Graphs of these ε and η functions can be found in [13].) 
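The kernels of Eqs. (1)-(3) are simple enough to sketch directly. In the sketch below, τ_s and τ_m default to the values fitted in Figure 2, and Δ_abs, τ_r^f, and τ_r^s follow the settings quoted in the text; the reset magnitudes u_abs and u_r^s, as well as all function names, are illustrative assumptions of ours, not values from the paper:

```python
import math

def psp(x, s, tau_s=1.5, tau_m=12.25):
    """PSP kernel eps(x, s) of Eq. (2).

    x: time since the most recent postsynaptic spike (ms)
    s: time since the presynaptic spike (ms)
    """
    if s < 0.0 or x < 0.0:
        return 0.0
    c = 1.0 / (1.0 - tau_s / tau_m)
    if s <= x:
        # presynaptic spike arrived after the last postsynaptic reset
        return c * (math.exp(-s / tau_m) - math.exp(-s / tau_s))
    # presynaptic spike preceded the last reset: the remaining synaptic
    # current re-excites the membrane after the reset
    return c * math.exp(-(s - x) / tau_s) * (math.exp(-x / tau_m) - math.exp(-x / tau_s))

def refractory(x, u_abs=-20.0, u_r_s=-2.0, d_abs=2.0, tau_r_f=1.225, tau_r_s=49.0):
    """Refractory kernel eta(x) of Eq. (3); x is time since the last
    postsynaptic spike (ms).  Reset magnitudes are illustrative;
    d_abs = 2 ms, tau_r_f = 0.1*tau_m, tau_r_s = 4*tau_m as in the text."""
    if x < 0.0:
        return 0.0
    slow = u_r_s * math.exp(-x / tau_r_s)        # elevated-threshold recovery
    if x <= d_abs:
        return u_abs + slow                      # absolute refractory period
    return u_abs * math.exp(-(x - d_abs) / tau_r_f) + slow  # smoothed tail

def membrane_potential(t, last_post, pre_spike_times, weights):
    """u_i(t) of Eq. (1): refractory term plus weighted PSPs."""
    u = refractory(t - last_post)
    for w, f_j in zip(weights, pre_spike_times):
        u += w * psp(t - last_post, t - f_j)
    return u
```

With `last_post = float('-inf')` (no prior postsynaptic spike) the refractory term vanishes and `psp` reduces to the usual difference-of-exponentials EPSP.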
We made a minor modification to the SRM described in [13] by relaxing the constraint that τ_r^s = τ_m; smoothing the absolute refractory function is mentioned in [13] but not explicitly defined as we do here. In all simulations presented, Δ_abs = 2 ms, τ_r^s = 4τ_m, and τ_r^f = 0.1τ_m.
The SRM we just described is deterministic. Gerstner [13] introduces a stochastic variant of the SRM (sSRM) by incorporating the notion of a stochastic firing threshold: given membrane potential u_i(t), the probability density of the neuron firing at time t is specified by ρ(u_i(t)). Herrmann & Gerstner [17] find that, for a realistic escape-rate noise model, the firing probability density as a function of the potential is initially small and constant, transitioning to an asymptotically linear increase around the threshold θ. In our simulations, we use such a function:

    ρ(v) = (α/β) (ln[1 + exp(β(θ − v))] − β(θ − v)),    (4)

where θ is the firing threshold in the absence of noise, β determines the abruptness of the constant-to-linear probability density transition around θ, and α determines the slope of the increasing part. Experiments with sigmoidal and exponential density functions were found to not qualitatively affect the results.

3 Minimizing Conditional Entropy

We now derive the rule for adjusting the weight from a presynaptic neuron j to a postsynaptic sSRM neuron i, so as to minimize the entropy of i's response given a particular spike sequence from j. A spike sequence is described by the set of all times at which spikes have occurred within some interval between 0 and T, denoted F_j^T for neuron j. We assume the interval is wide enough that spikes outside the interval do not influence the state of the neuron within the interval (e.g., through threshold reset effects). 
We can then treat intervals as independent of each other.
Let the postsynaptic neuron i produce a response ξ ∈ Ω_i, where Ω_i is the set of all possible responses given the input F_j^T, and g(ξ) is the probability density over responses. The differential conditional entropy h(Ω_i) of neuron i's response is then defined as:

    h(Ω_i) = − ∫_{Ω_i} g(ξ) log g(ξ) dξ.    (5)

To minimize the differential conditional entropy by adjusting the neuron's weights, we compute the gradient of the conditional entropy with respect to the weights:

    ∂h(Ω_i)/∂w_ij = − ∫_{Ω_i} g(ξ) [log g(ξ) + 1] ∂log g(ξ)/∂w_ij dξ.    (6)

For a differentiable neuron model, ∂log g(ξ)/∂w_ij can be expressed as follows when neuron i fires once at time f̂_i [18]:

    ∂log g(ξ)/∂w_ij = ∫_{t=0}^{T} (∂ρ(u_i(t))/∂u_i(t)) (∂u_i(t)/∂w_ij) (δ(t − f̂_i) − ρ(u_i(t)))/ρ(u_i(t)) dt,    (7)

where δ(.) is the Dirac delta, and ρ(u_i(t)) is the firing probability density of neuron i at time t. (See [18] for the generalization to multiple postsynaptic spikes.) With the sSRM we can compute the partial derivatives ∂ρ(u_i(t))/∂u_i(t) and ∂u_i(t)/∂w_ij. Given the density function (4),

    ∂ρ(u_i(t))/∂u_i(t) = α / (1 + exp(β(θ − u_i(t)))),    ∂u_i(t)/∂w_ij = ε(t − f̂_i, t − f_j).

To perform gradient descent in the conditional entropy, we use the weight update

    Δw_ij ∝ − ∂h(Ω_i)/∂w_ij    (8)
          = ∫_{Ω_i} g(ξ) [log g(ξ) + 1] ∫_{t=0}^{T} [α ε(t − f̂_i, t − f_j) (δ(t − f̂_i) − ρ(u_i(t)))] / [(1 + exp(β(θ − u_i(t)))) ρ(u_i(t))] dt dξ.

We can use numerical methods to evaluate Equation (8). However, it seems biologically unrealistic to suppose a neuron can integrate over all possible responses ξ. This dilemma can be circumvented in two ways. First, the resulting learning rule might be cached in some form through evolution so that the full computation is not necessary (e.g., in an STDP curve). 
Second, the specific response produced by a neuron on a single trial might be considered to be a sample from the distribution g(ξ), and the integration is performed by a sampling process over repeated trials; each trial would then produce a stochastic gradient step.

Figure 2: (a) Experimental setup of Zhang et al. and (b) their experimental STDP curve (small squares) vs. our model (solid line). Model parameters: τ_s = 1.5 ms, τ_m = 12.25 ms.

4 Simulation Methodology

We model in detail the experiment of Zhang et al. [12] (Figure 2a). In this experiment, a post neuron is identified that has two neurons projecting to it, call them the pre and the driver. The pre is subthreshold: it produces depolarization but no spike. The driver is suprathreshold: it induces a spike in the post. Plasticity of the pre-post synapse is measured as a function of the timing between pre and post spikes (Δt_pre-post) by varying the timing between induced spikes in the pre and the driver (Δt_pre-driver). This measurement yields the well-known STDP curve (Figure 1b).¹ The experiment imposes several constraints on a simulation: the driver alone causes spiking > 70% of the time, the pre alone causes spiking < 10% of the time, synchronous firing of driver and pre causes LTP if and only if the post fires, and the time constants of the EPSPs--τ_s and τ_m in the sSRM--are in the range of 1-3 ms and 10-15 ms respectively. These constraints remove many free parameters from our simulation. We do not explicitly model the two input cells; instead, we model the EPSPs they produce. The magnitudes of these EPSPs are chosen to satisfy the experimental constraints: the driver EPSP alone causes a spike in the post on 77.4% of trials, and the pre EPSP alone causes a spike on fewer than 0.1% of trials. 
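This tuning and the estimate of Equation 8 both reduce to a few numerically simple pieces: the escape-rate density ρ of Eq. (4), the probability that a given potential trace elicits at least one spike (one minus the survival probability exp(−∫ρ dt)), and the per-response gradient ∂log g/∂w of Eq. (7). A sketch on a time grid, with all function names, grid choices, and default constants our own:

```python
import numpy as np

def rho(u, theta=1.0, alpha=1.0, beta=5.0):
    """Escape-rate firing density of Eq. (4); logaddexp(0, z) = ln(1 + e^z)
    is a numerically stable softplus."""
    return (alpha / beta) * np.logaddexp(0.0, beta * (np.asarray(u, float) - theta))

def spike_probability(u, dt, **kw):
    """P(at least one spike) for a potential trace u sampled at step dt:
    1 - exp(-sum rho(u) dt), i.e. one minus the survival probability."""
    return 1.0 - np.exp(-np.sum(rho(u, **kw)) * dt)

def dlog_g_dw(u, eps, spike_idx, dt, theta=1.0, alpha=1.0, beta=5.0):
    """Discretized Eq. (7) for a response with one postsynaptic spike.

    u:   membrane potential on the time grid
    eps: PSP kernel eps(t - f_hat, t - f_j) on the same grid (du/dw_ij)
    spike_idx: grid index of the observed postsynaptic spike
    """
    u = np.asarray(u, float)
    drho_du = alpha / (1.0 + np.exp(beta * (theta - u)))   # d(rho)/du
    integrand = drho_du * np.asarray(eps, float)           # d(rho)/dw_ij
    # Dirac-delta term at the observed spike minus the integrated hazard
    return integrand[spike_idx] / rho(u, theta, alpha, beta)[spike_idx] \
        - np.sum(integrand) * dt
```

For a single-spike response, `dlog_g_dw` is exactly the inner time integral of Equation 8; weighting it by g(ξ)[log g(ξ) + 1] and summing over responses ξ gives −∂h(Ω_i)/∂w_ij.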
Free parameters of the simulation are θ and β in the spike-probability function (α can be folded into the learning rate), and the magnitudes (u_r^s, u_abs) and time constants (τ_r^f, τ_r^s, Δ_abs) of the refractory reset. The dependent variable of the simulation is Δt_pre-driver, and we measure the time of the post spike to determine Δt_pre-post. We estimate the weight update for a given Δt_pre-driver using Equation 8, approximating the integral by a summation over all time-discretized output responses consisting of 0, 1, or 2 spikes. Three or more spikes have a vanishingly small probability.

¹ In most experimental studies of STDP, the driver neuron is not used: the post is induced to spike by a direct depolarizing current injection. Modeling current injections requires additional assumptions. Consequently, we focus on the Zhang et al. experiment.

Figure 3: (a) LTP and LTD plasticity as a function of synaptic efficacy of the subthreshold input. (b)-(d) STDP curves predicted by the model as τ_m, u_r^s, and θ are manipulated.

5 Results

Figure 2b shows a typical STDP curve obtained from the model by plotting the estimated weight update of Equation 8 against Δt_pre-post. The model also explains a key finding that has not been explained by any other account, namely, that the magnitude of LTP or LTD decreases as the efficacy of the synapse between the pre and the post increases [2]. Further, the dependence is stronger for LTP than LTD. Figure 3a plots the magnitude of LTP for Δt_pre-post = −5 ms and the magnitude of LTD for Δt_pre-post = 7 ms as the amplitude of the pre's EPSP is increased. The magnitude of the weight change decreases as the weight increases, and this effect is stronger for LTP than LTD. The model's explanation for this phenomenon is simple: as the weight increases, its effect saturates, and a small change to the weight does little to alter its influence. 
Consequently, the gradient of the entropy with respect to the weight goes toward zero.
The qualitative shape of the STDP curve is robust to settings of the model's parameters, e.g., the EPSP decay time constant τ_m (Figure 3b), the strength of the threshold reset u_r^s (Figure 3c), and the spiking threshold θ (Figure 3d). Additionally, the form of the spike-probability function (exponential, sigmoidal, or linear) is not critical. The model makes two predictions relating the shape of the STDP curve to properties of a neuron. These predictions are empirically testable if a diverse population of cells can be studied: (1) the width of the LTD and LTP windows should depend on the EPSP decay time constant (Figure 3b); (2) the relative strength of LTP and LTD should depend on the strength of the threshold reset (Figure 3c), because stronger resets lead to reduced LTD by reducing the probability of a second spike.

6 Discussion

In this paper, we explored a fundamental computational principle: that synapses adapt so as to minimize the variability of a neuron's response in the face of noisy inputs, yielding more reliable neural representations. From this principle--instantiated as conditional entropy minimization--we derived the STDP learning curve. Importantly, the simulation methodology we used to derive the curve closely follows the procedure used in neurophysiological experiments [12]. Our simulations obtain an STDP curve that is robust to model parameters and details of the noise distribution.
Our results are critically dependent on the use of Gerstner's stochastic Spike Response Model, whose dynamics are a good approximation to those of a biological spiking neuron. 
The sSRM has the virtue of being characterized by parameters that are readily related to neural dynamics, and its dynamics are differentiable, allowing us to derive a gradient-descent learning rule.
Our simulations are based on the classical STDP experiment in which a single presynaptic spike is paired with a single postsynaptic spike. The same methodology can be applied to the situation in which there are multiple presynaptic and/or postsynaptic spikes, although the computation involved becomes nontrivial. We are currently modeling the data from multi-spike experiments.
We modeled the Zhang et al. experiment in which a driver neuron is used to induce the post to fire. Most other studies instead induce the post to fire with a depolarizing current injection. We are not aware of any established model for current injection within the SRM framework, and we are currently elaborating such a model. We then expect to be able to simulate experiments in which current injections are used, allowing us to investigate the interesting question of whether the two experimental techniques produce different forms of STDP.

Acknowledgement: Work of SMB was supported by the Netherlands Organization for Scientific Research (NWO), TALENT grant S-62 588.

References
 [1] G-q. Bi and M-m. Poo. Synaptic modification by correlated activity: Hebb's postulate revisited. Ann. Rev. Neurosci., 24:139-166, 2001.
 [2] A. Kepecs, M.C.W. van Rossum, S. Song, and J. Tegner. Spike-timing-dependent plasticity: common themes and divergent vistas. Biol. Cybern., 87:446-458, 2002.
 [3] A. Saudargiene, B. Porr, and F. Wörgötter. How the shape of pre- and postsynaptic signals can influence STDP: A biophysical model. Neural Comp., 16:595-625, 2004.
 [4] W. Gerstner, R. Kempter, J.L. van Hemmen, and H. Wagner. A neural learning rule for sub-millisecond temporal coding. Nature, 383:76-78, 1996.
 [5] S. Song, K. Miller, and L. Abbott. 
Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat. Neurosci., 3:919-926, 2000.
 [6] M.C.W. van Rossum, G.-q. Bi, and G.G. Turrigiano. Stable Hebbian learning from spike timing-dependent plasticity. J. Neurosci., 20:8812-8821, 2000.
 [7] L.F. Abbott and W. Gerstner. Homeostasis and learning through STDP. In D. Hansel et al. (eds), Methods and Models in Neurophysics, 2004.
 [8] R.P.N. Rao and T.J. Sejnowski. Spike-timing-dependent plasticity as temporal difference learning. Neural Comp., 13:2221-2237, 2001.
 [9] P. Dayan and M. Häusser. Plasticity kernels and temporal statistics. In S. Thrun, L. Saul, and B. Schölkopf, editors, NIPS 16, 2004.
[10] G. Chechik. Spike-timing-dependent plasticity and relevant mutual information maximization. Neural Comp., 15:1481-1510, 2003.
[11] R.C. Froemke and Y. Dan. Spike-timing-dependent synaptic modification induced by natural spike trains. Nature, 416:433-438, 2002.
[12] L.I. Zhang, H.W. Tao, C.E. Holt, W.A. Harris, and M-m. Poo. A critical window for cooperation and competition among developing retinotectal synapses. Nature, 395:37-44, 1998.
[13] W. Gerstner. A framework for spiking neuron models: The spike response model. In F. Moss & S. Gielen (eds), The Handbook of Biological Physics, vol. 4, pp. 469-516, 2001.
[14] A.J. Bell and L.C. Parra. Maximizing information yields spike timing dependent plasticity. NIPS 17, 2005.
[15] T. Toyoizumi, J-P. Pfister, K. Aihara, and W. Gerstner. Spike-timing dependent plasticity and mutual information maximization for a spiking neuron model. NIPS 17, 2005.
[16] R. Jolivet, T.J. Lewis, and W. Gerstner. The spike response model: a framework to predict neuronal spike trains. In Kaynak et al. (eds), Proc. ICANN/ICONIP 2003, pp. 846-853, 2003.
[17] A. Herrmann and W. Gerstner. Noise and the PSTH response to current transients: I. J. Comp. Neurosci., 11:135-151, 2001.
[18] X. Xie and H.S. Seung. 
Learning in neural networks by reinforcement of irregular spiking. Physical Review E, 69:041909, 2004.
", "award": [], "sourceid": 2589, "authors": [{"given_name": "Sander", "family_name": "Bohte", "institution": null}, {"given_name": "Michael", "family_name": "Mozer", "institution": null}]}