{"title": "Bayesian Inference for Spiking Neuron Models with a Sparsity Prior", "book": "Advances in Neural Information Processing Systems", "page_first": 529, "page_last": 536, "abstract": null, "full_text": "Bayesian Inference for Spiking Neuron Models\n\nwith a Sparsity Prior\n\nSebastian Gerwinn\n\nJakob H Macke\n\nMatthias Seeger\n\nMatthias Bethge\n\nSpemannstrasse 41\n\nMax Planck Institute for Biological Cybernetics\n\n{firstname.surname}@tuebingen.mpg.de\n\n72076 Tuebingen, Germany\n\nAbstract\n\nGeneralized linear models are the most commonly used tools to describe the stim-\nulus selectivity of sensory neurons. Here we present a Bayesian treatment of such\nmodels. Using the expectation propagation algorithm, we are able to approximate\nthe full posterior distribution over all weights. In addition, we use a Laplacian\nprior to favor sparse solutions. Therefore, stimulus features that do not critically\nin\ufb02uence neural activity will be assigned zero weights and thus be effectively\nexcluded by the model. This feature selection mechanism facilitates both the in-\nterpretation of the neuron model as well as its predictive abilities. The posterior\ndistribution can be used to obtain con\ufb01dence intervals which makes it possible\nto assess the statistical signi\ufb01cance of the solution. In neural data analysis, the\navailable amount of experimental measurements is often limited whereas the pa-\nrameter space is large. In such a situation, both regularization by a sparsity prior\nand uncertainty estimates for the model parameters are essential. We apply our\nmethod to multi-electrode recordings of retinal ganglion cells and use our uncer-\ntainty estimate to test the statistical signi\ufb01cance of functional couplings between\nneurons. 
Furthermore, we use the sparsity of the Laplace prior to select those filters from a spike-triggered covariance analysis that are most informative about the neural response.\n\n1 Introduction\n\nA central goal of systems neuroscience is to identify the functional relationship between environmental stimuli and a neural response. Given an arbitrary stimulus, we would like to predict the neural response as well as possible. In order to achieve this goal with a limited amount of data, it is essential to combine the information in the data with prior knowledge about neural function. To this end, generalized linear models (GLMs) have proven to be particularly useful, as they allow for flexible model architectures while still being tractable for estimation.\n\nThe GLM neuron model consists of a linear filter, a static nonlinear transfer function, and a Poisson spike-generating mechanism. To determine the neural response to a given stimulus, the stimulus is first convolved with the linear filter (i.e. the receptive field of the neuron). Subsequently, the filter output is converted into an instantaneous firing rate via the static nonlinear transfer function, and finally spikes are generated from an inhomogeneous Poisson process according to this firing rate. Note, however, that the GLM neuron model is not limited to describing neurons with Poisson firing statistics. Rather, it is possible to incorporate influences of the neuron's own spiking history on its response: the firing rate is then determined by a combination of both the external stimulus and the spiking history of the neuron. Thus, the model can account for typical effects such as refractory periods, bursting behavior, or spike-frequency adaptation.
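The generative process just described (linear filtering, static nonlinearity, inhomogeneous Poisson spiking, and spike-history feedback) can be sketched in a few lines. The filter values, history weights, bin size, and the exponential transfer function below are illustrative assumptions, not values from this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# All dimensions and parameter values below are made up for illustration.
dt = 0.001                        # time bin width in seconds
T = 2000                          # number of time bins (2 s of data)
k = 0.5 * rng.normal(size=20)     # linear filter (receptive field)
h = np.array([-5.0, -2.0, -1.0])  # spike-history weights: recent spikes suppress
b = np.log(20.0)                  # bias term, roughly a 20 Hz baseline rate

x = rng.normal(size=T)            # white-noise stimulus
spikes = np.zeros(T)

for t in range(len(k), T):
    # linear stage: filter applied to the recent stimulus window
    drive = b + k @ x[t - len(k):t]
    # spike-history stage: weighted spike counts from the last 3 bins
    drive += h @ spikes[t - 3:t][::-1]
    # static nonlinearity: exponential transfer function -> instantaneous rate
    rate = np.exp(drive)
    # one draw from the inhomogeneous Poisson process within this small bin
    spikes[t] = float(rng.random() < 1.0 - np.exp(-rate * dt))

print(f"{int(spikes.sum())} spikes in {T * dt:.1f} s")
```

The negative history weights make a spike transiently suppress the firing rate, mimicking a relative refractory period; positive weights would instead produce bursting.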
Last but not least, the GLM neuron model can also be applied to populations of coupled neurons by making each neuron dependent not only on its own spiking activity but also on the spiking history of all the other neurons.\n\nIn previous work (Pillow et al., 2005; Chornoboy et al., 1988; Okatan et al., 2005) it has been shown how point estimates of the GLM parameters can be obtained using maximum-likelihood (or maximum a posteriori, MAP) techniques. Here, we take this approach one step further by using Bayesian inference methods in order to obtain an approximation to the full posterior distribution, rather than point estimates. In particular, the posterior determines confidence intervals for every linear weight, which facilitates the interpretation of the model and its parameters. For example, if a weight describes the strength of coupling between two neurons, then we can use these confidence intervals to test whether this weight is significantly different from zero. In this way, we can readily distinguish statistically significant interactions between neurons from spurious couplings.\n\nAnother application of the Bayesian GLM neuron model arises in the context of spike-triggered covariance analysis. Spike-triggered covariance employs a quadratic expansion of the external stimulus parameter space and is often used to determine the most informative subspace. By combining spike-triggered covariance analysis with the Bayesian GLM framework, we present a new method for selecting the filters of this subspace.\n\nFeature selection in the GLM neuron model can be achieved by assuming a Laplace prior over the linear weights, which naturally leads to sparse posterior solutions: all weights are pushed toward zero equally strongly. This contrasts with the Gaussian prior, which pushes each weight toward zero in proportion to its absolute value.
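The difference between the two shrinkage behaviors is easiest to see in a toy setting with a single noisy Gaussian observation per weight (an illustrative assumption, not the Poisson GLM likelihood itself), where both MAP estimates have closed forms:

```python
import numpy as np

# Toy setting: each weight w is observed once as w_obs ~ N(w, sigma2),
# and we compare the MAP estimate under the two priors.

def map_gaussian_prior(w_obs, sigma2, tau2):
    # Gaussian prior N(0, tau2): shrinkage proportional to the weight itself,
    # so estimates get smaller but never reach exactly zero.
    return w_obs * tau2 / (tau2 + sigma2)

def map_laplace_prior(w_obs, sigma2, lam):
    # Laplace prior with rate lam: soft-thresholding -- every weight is pulled
    # toward zero by the same constant amount lam * sigma2, and weights whose
    # magnitude falls below that threshold become exactly zero (sparsity).
    return np.sign(w_obs) * np.maximum(np.abs(w_obs) - lam * sigma2, 0.0)

w_obs = np.array([-2.0, -0.3, 0.1, 0.5, 3.0])
print(map_gaussian_prior(w_obs, sigma2=1.0, tau2=1.0))
# all entries shrunk but nonzero: [-1.0, -0.15, 0.05, 0.25, 1.5]
print(map_laplace_prior(w_obs, sigma2=1.0, lam=0.4))
# small weights exactly zero: [-1.6, 0.0, 0.0, 0.1, 2.6]
```

The constant pull of the Laplace prior is what produces exact zeros, i.e. the feature selection exploited in this paper.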
In this sense, the Laplace prior can also be seen as an efficient regularizer, well suited to situations in which a large range of alternative explanations for the neural response must be compared on the basis of limited data. As we do not perform gradient descent on the posterior, differentiability of the posterior is not required.\n\nThe paper is organized as follows: In section 2, we describe the model and the \u201cexpectation propagation\u201d algorithm (Minka, 2001; Opper & Winther, 2000) used to find the approximate posterior distribution. In section 3, we estimate the receptive fields, spike-history effects, and functional couplings of a small population of retinal ganglion cells. We demonstrate that for small training sets, the Laplace prior leads to superior performance compared to a Gaussian prior, which does not lead to sparse solutions. We use the confidence intervals to test whether the functional couplings between the neurons are significant. In section 4, we use the GLM neuron model to describe a complex-cell response recorded in macaque primary visual cortex: after computing the spike-triggered covariance (STC), we determine the relevant stimulus subspace via feature selection in our model. In contrast to the usual approach, the selection of the subspace in our case becomes directly linked to an explicit neuron model, which also takes into account the spike-history dependence of the spike generation.\n\n2 Generalized Linear Models and Expectation Propagation\n\n2.1 Generalized Linear Models\n\nLet X_t \u2208 R^d, t \u2208 [0, T], denote a time-varying stimulus and D_i = {t_{i,j}} the spike times of i = 1, . . . , n neurons. Here X_t consists of the sensory input at time t and can include preceding input frames as well. We assume that the stimulus can only change at distinct time points, but can be evaluated at continuous time t.
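In code, this setup amounts to a piecewise-constant stimulus plus one set of spike times per neuron. The concrete frame times, frame values, and spike times below are invented for illustration:

```python
import numpy as np

# Invented example data: 4 stimulus frames (d = 2) and n = 2 neurons.
frame_times = np.array([0.0, 0.1, 0.2, 0.3])   # frame onset times (s)
frames = np.array([[0.2, -1.0], [1.3, 0.4],
                   [-0.7, 0.9], [0.0, 0.5]])   # X at each onset

def X(t):
    # The stimulus only changes at frame onsets but is defined for any
    # continuous t: look up the frame active at time t.
    idx = np.searchsorted(frame_times, t, side="right") - 1
    return frames[idx]

# D_i: spike times of neuron i
D = [np.array([0.012, 0.15, 0.31]),
     np.array([0.05, 0.26])]

t = 0.25
past = [d[d < t] for d in D]    # only causal (past) spikes may enter features
print(X(t), [p.tolist() for p in past])
```

Restricting each neuron's history to spikes strictly before t is what keeps the feature map causal, as required below.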
We would like to incorporate spike-history effects, couplings between neurons, and dependence on nonlinear features of the stimulus. Therefore, we describe the effective input to a neuron via the following feature map:\n\n\u03c8(t) = \u03c8_st(X_t) \u2295_i \u03c8_sp,i({t_{i,j} \u2208 D_i : t_{i,j} < t}),\n\nwhere \u03c8_sp,i represents the spike-time history of neuron i and \u03c8_st the possibly nonlinear feature map for the stimulus. That is, the complete feature vector \u03c8 contains possibly nonlinear features of the stimulus and the spike history of every neuron. Any feature which is causal, in the sense that it does not depend on future events, can be used. We model the spike-history dependence by a set of small time windows [t \u2212 \u03c4_l, t \u2212 \u03c4^0_l) in which occurring spikes are counted:\n\n(\u03c8_sp,i({t_{i,j} \u2208 D_i : t_{i,j} < t}))_l = \u2211_{j : t_{i,j} < t} 1_{[t \u2212 \u03c4_l, t \u2212 \u03c4^0_l)}(t_{i,j}),