{"title": "Low-dimensional models of neural population activity in sensory cortical circuits", "book": "Advances in Neural Information Processing Systems", "page_first": 343, "page_last": 351, "abstract": "Neural responses in visual cortex are influenced by visual stimuli and by ongoing spiking activity in local circuits. An important challenge in computational neuroscience is to develop models that can account for both of these features in large multi-neuron recordings and to reveal how stimulus representations interact with and depend on cortical dynamics. Here we introduce a statistical model of neural population activity that integrates a nonlinear receptive field model with a latent dynamical model of ongoing cortical activity. This model captures the temporal dynamics, effective network connectivity in large population recordings, and correlations due to shared stimulus drive as well as common noise. Moreover, because the nonlinear stimulus inputs are mixed by the ongoing dynamics, the model can account for a relatively large number of idiosyncratic receptive field shapes with a small number of nonlinear inputs to a low-dimensional latent dynamical model. We introduce a fast estimation method using online expectation maximization with Laplace approximations. Inference scales linearly in both population size and recording duration. We apply this model to multi-channel recordings from primary visual cortex and show that it accounts for a large number of individual neural receptive fields using a small number of nonlinear inputs and a low-dimensional dynamical model.", "full_text": "Low-dimensional models of neural population activity in sensory cortical circuits\n\nEvan Archer1,2, Urs Köster3, Jonathan Pillow4, Jakob H. 
Macke1,2\n\n1 Max Planck Institute for Biological Cybernetics, Tübingen\n2 Bernstein Center for Computational Neuroscience, Tübingen\n3 Redwood Center for Theoretical Neuroscience, University of California at Berkeley\n4 Princeton Neuroscience Institute, Department of Psychology, Princeton University\n\nevan.archer@tuebingen.mpg.de, urs@nervanasys.com, pillow@princeton.edu, jakob@tuebingen.mpg.de\n\nAbstract\n\nNeural responses in visual cortex are influenced by visual stimuli and by ongoing spiking activity in local circuits. An important challenge in computational neuroscience is to develop models that can account for both of these features in large multi-neuron recordings and to reveal how stimulus representations interact with and depend on cortical dynamics. Here we introduce a statistical model of neural population activity that integrates a nonlinear receptive field model with a latent dynamical model of ongoing cortical activity. This model captures temporal dynamics and correlations due to shared stimulus drive as well as common noise. Moreover, because the nonlinear stimulus inputs are mixed by the ongoing dynamics, the model can account for multiple idiosyncratic receptive field shapes with a small number of nonlinear inputs to a low-dimensional dynamical model. We introduce a fast estimation method using online expectation maximization with Laplace approximations, for which inference scales linearly in both population size and recording duration. We apply this model to multi-channel recordings from primary visual cortex and show that it accounts for neural tuning properties as well as cross-neural correlations.\n\n1 Introduction\n\nNeurons in sensory cortices organize into highly-interconnected circuits that share common input, dynamics, and function. 
For example, across a cortical column, neurons may share stimulus dependence as a result of sampling the same location of visual space, having similar orientation preference [1] or receptive fields with shared sub-units [2]. As a result, a substantial fraction of stimulus-information can be redundant across neurons [3]. Recent advances in electrophysiology and functional imaging allow us to simultaneously probe the responses of the neurons in a column. However, the high dimensionality and (relatively) short duration of the resulting data render analysis a difficult statistical problem.\n\nRecent approaches to modeling neural activity in visual cortex have focused on characterizing the responses of individual neurons by linearly projecting the stimulus on a small feature subspace that optimally drives the cell [4, 5]. Such “systems-identification” approaches seek to describe the stimulus-selectivity of single neurons separately, treating each neuron as an independent computational unit. Other studies have focused on providing probabilistic models of the dynamics of neural populations, seeking to elucidate the internal dynamics underlying neural responses [6, 7, 8, 9, 10, 11]. These approaches, however, typically do not model the effect of the stimulus (or do so using only a linear stimulus drive). To realize the potential of modern recording technologies and to advance our understanding of neural population coding, we need methods for extracting both the features that drive a neural population and the resulting population dynamics [12].\n\nWe propose the Quadratic Input Latent Dynamical System (QLDS) model, a statistical model that combines a low-dimensional representation of population dynamics [9] with a low-dimensional description of stimulus selectivity [13]. A low-dimensional dynamical system governs the population response, and receives a nonlinear (quadratic) stimulus-dependent input. 
We model neural spike responses as Poisson (conditional on the latent state), with exponential firing-rate nonlinearities. As a result, population dynamics and stimulus drive interact multiplicatively to modulate neural firing. By modeling dynamics and stimulus dependence, our method captures correlations in response variability while also uncovering stimulus selectivity shared across a population.\n\nFigure 1: Schematic illustrating the Quadratic input latent dynamical system model (QLDS). The sensory stimulus is filtered by multiple units with quadratic stimulus selectivity (only one of which is shown) which model the feed-forward input into the population. This stimulus drive provides input into a multi-dimensional linear dynamical system model which models recurrent dynamics and shared noise within the population. Finally, each neuron y_i in the population is influenced by the dynamical system via a linear readout. QLDS therefore models both the stimulus selectivity and the spatio-temporal correlations of the population.\n\n2 The Quadratic Input Latent Dynamical System (QLDS) model\n\n2.1 Model\n\nWe summarize the collective dynamics of a population using a linear, low-dimensional dynamical system with an n-dimensional latent state x_t. The evolution of x_t is given by\n\nx_t = A x_{t-1} + f_φ(h_t) + ε_t,    (1)\n\nwhere A is the n × n dynamics matrix and ε_t is Gaussian innovation noise with covariance matrix Q, ε_t ∼ N(0, Q). Each stimulus h_t drives some dimensions of x_t via a nonlinear function of the stimulus, f_φ, with parameters φ, where the exact form of f_φ(·) will be discussed below. The log firing rates z_t of the population couple to the latent state x_t via a loading matrix C,\n\nz_t = C x_t + D ∗ s_t + d.    (2)\n\nHere, we also include a second external input s_t, which is used to model the dependence of the firing rate of each neuron on its own spiking history [14]. We define D ∗ s_t to be that vector whose k-th element is given by (D ∗ s_t)_k ≡ ∑_{i=1}^{N_s} D_{k,i} s_{k,t-i}. D therefore models single-neuron properties that are not explained by shared population dynamics, and captures neural properties such as burstiness or refractory periods. The vector d sets a constant, private baseline (log) firing rate for each neuron. The vector x_t represents the n-dimensional state of m neurons. Typically n < m, so the model parameterizes a low-dimensional dynamics for the population.\n\nWe assume that, conditional on z_t, the observed activity y_t of m neurons is Poisson-distributed,\n\ny_{k,t} ∼ Poisson(exp(z_{k,t})).    (3)\n\nWhile the Poisson likelihood provides a realistic probabilistic model for the discrete nature of spiking responses, it makes learning and inference more challenging than it would be for a Gaussian model. As we discuss in the subsequent section, we rely on computationally-efficient approximations to perform inference under the Poisson observation model for QLDS.\n\n2.2 Nonlinear stimulus dependence\n\nIndividual neurons in visual cortex respond selectively to only a small subset of stimulus features [4, 15]. Certain subpopulations of neurons, such as in a cortical column, share substantial receptive field overlap. We model such a neural subpopulation as sensitive to stimulus variation in a linear subspace of stimulus space, and seek to characterize this subspace by learning a set of basis vectors, or receptive fields, w_i. In QLDS, a subset of latent states receives a nonlinear stimulus drive, each of which reflects modulation by a particular receptive field w_i. We consider three different forms of stimulus model: a fully linear model, and two distinct quadratic models. 
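To make the generative model of Section 2.1 concrete, the following minimal Python sketch draws latent states and spike counts from Eqs. 1-3. It is an illustration under stated assumptions, not the authors' implementation: the function names (sample_poisson, simulate_qlds), the toy parameter values, the diagonal innovation covariance, and the omission of the spike-history term D ∗ s_t are all choices made here for brevity, and only the Python standard library is used.

```python
import math
import random


def sample_poisson(rate, rng):
    # Knuth's multiplication method; adequate for the small rates used here.
    threshold = math.exp(-rate)
    count, running = 0, rng.random()
    while running > threshold:
        count += 1
        running *= rng.random()
    return count


def simulate_qlds(A, C, d, drive, q_diag, T, seed=0):
    # Draw latent states and spike counts following Eqs. 1-3.
    # A: n x n dynamics, C: m x n loading matrix, d: length-m offsets,
    # drive: hypothetical callable t -> length-n stimulus input f(h_t),
    # q_diag: innovation variances (a diagonal Q is assumed for brevity).
    # The spike-history term D * s_t is left out of this sketch.
    rng = random.Random(seed)
    n, m = len(A), len(C)
    x = [0.0] * n
    states, spikes = [], []
    for t in range(T):
        f_t = drive(t)
        # Eq. 1: linear dynamics plus stimulus drive plus Gaussian noise.
        x = [
            sum(A[i][j] * x[j] for j in range(n)) + f_t[i]
            + rng.gauss(0.0, math.sqrt(q_diag[i]))
            for i in range(n)
        ]
        # Eq. 2 (without history term): log firing rates.
        z = [sum(C[k][i] * x[i] for i in range(n)) + d[k] for k in range(m)]
        # Eq. 3: conditionally independent Poisson counts.
        y = [sample_poisson(math.exp(z_k), rng) for z_k in z]
        states.append(x)
        spikes.append(y)
    return states, spikes


# Toy example: 2 latent dimensions, 3 neurons, one driven latent dimension.
A = [[0.9, 0.0], [0.1, 0.8]]
C = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
d = [-2.0, -2.0, -2.0]
states, spikes = simulate_qlds(
    A, C, d,
    drive=lambda t: [0.1 * math.sin(0.2 * t), 0.0],
    q_diag=[0.01, 0.01],
    T=200,
)
```

Because the rate is the exponential of a linear readout, the stimulus drive and the latent noise enter the firing rate multiplicatively, which is the interaction described in the Introduction.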
Although it is possible to incorporate more complicated stimulus models within the QLDS framework, the quadratic models’ compact parameterization and analytic elegance make them both flexible and computationally tractable. What’s more, quadratic stimulus models appear in many classical models of neural computation, e.g. the Adelson-Bergen model for motion-selectivity [16]; quadratic models are also sometimes used in the classification of simple and complex cells in area V1 [4].\n\nWe express our stimulus model by the function f_φ(h_t), where φ represents the set of parameters describing the stimulus filters w_i and mixing parameters a_i, b_i and c_i (in the case of the quadratic models). When f_φ(h_t) is identically 0 (no stimulus input), the QLDS with Poisson observations reduces to what has been previously studied as the Poisson Latent Dynamical System (PLDS) [17, 18, 9]. We briefly review the three stimulus models we consider, and discuss their computational properties.\n\nLinear: The simplest stimulus model we consider is a linear function of the stimulus,\n\nf_φ(h_t) = B h_t,    (4)\n\nwhere the rows of B act as linear filters, and φ = {B}. This baseline model is identical to [18, 9] and captures simple cell-like receptive fields since the input to latent states is linear and the observation process is generalized linear.\n\nQuadratic: Under the linear model, latent dynamics receive linear input from the stimulus along a single filter dimension, w_i. In the quadratic model, we permit the input to each state to be a quadratic function of the filter output w_i^T h_t. We describe the quadratic by including three additional parameters per latent dimension, so that the stimulus drive takes the form\n\nf_{φ,i}(h_t) = a_i (w_i^T h_t)^2 + b_i w_i^T h_t + c_i.    (5)\n\nHere, the parameters φ = {w_i, a_i, b_i, c_i : i = 1, ..., m} include multiple stimulus filters w_i and quadratic parameters (a_i, b_i, c_i). Equation 5 might result in a stimulus input that has non-zero mean with respect to the distribution of the stimulus h_t, which may be undesirable. Given the covariance of h_t, it is straightforward to constrain the input to be zero-mean by setting c_i = -a_i w_i^T Σ w_i, where Σ is the covariance of h_t and we assume the stimulus to have zero mean as well. The quadratic model enables QLDS to capture phase-invariant responses, like those of complex cells in area V1.\n\nQuadratic with multiplicative interactions: In the above model, there are no interactions between different stimulus filters, which makes it difficult to model suppressive or facilitating interactions between features [4]. Although contributions from different filters combine in the dynamics of x, any interactions are linear. Our third stimulus model allows for multiplicative interactions between r < m stimulus filters, with the i-th dimension of the input given by\n\nf_{φ,i}(h_t) = ∑_{j=1}^{r} a_{i,j} (w_i^T h_t)(w_j^T h_t) + b_i w_i^T h_t + c_i.\n\nAgain, we constrain this function to have zero mean by setting c_i = -∑_{j=1}^{r} a_{i,j} w_i^T Σ w_j.\n\n2.3 Learning & Inference\n\nWe learn all parameters via the expectation-maximization (EM) algorithm. EM proceeds by alternating between expectation (E) and maximization (M) steps, iteratively maximizing a lower bound on the log likelihood [19]. In the E-step, one infers the distribution over trajectories x_t, given data and the parameter estimates from the previous iteration. In the M-step, one updates the current parameter estimates by maximizing the expected complete-data log likelihood under that distribution, which yields a lower bound on the log likelihood. 
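As a brief aside on the quadratic stimulus drive of Section 2.2 (Eq. 5), the zero-mean constraint can be checked numerically. For a zero-mean stimulus with covariance Σ, the linear term has zero mean and E[(w^T h)^2] = w^T Σ w, so the offset has to cancel the quadratic term, c = -a w^T Σ w. The sketch below verifies this exactly for a small binary white-noise stimulus by enumerating every stimulus pattern. The function names and the toy filter values are hypothetical and chosen only for illustration.

```python
from itertools import product


def quadratic_drive(h, w, a, b, c):
    # Eq. 5 for one latent dimension: a (w^T h)^2 + b (w^T h) + c.
    proj = sum(w_j * h_j for w_j, h_j in zip(w, h))
    return a * proj ** 2 + b * proj + c


def zero_mean_offset(w, a, sigma):
    # c = -a w^T Sigma w, the offset that cancels E[a (w^T h)^2]
    # when the stimulus has zero mean and covariance Sigma.
    p = len(w)
    quad = sum(w[j] * sigma[j][k] * w[k] for j in range(p) for k in range(p))
    return -a * quad


# Hypothetical 4-dimensional filter and quadratic parameters.
w = [0.5, -1.0, 0.25, 2.0]
a, b = 0.7, -0.3

# Binary (+1/-1) white noise has identity covariance.
sigma = [[1.0 if j == k else 0.0 for k in range(4)] for j in range(4)]
c = zero_mean_offset(w, a, sigma)

# Exact mean of the drive over all 2^4 equally likely stimulus patterns.
stimuli = list(product([-1.0, 1.0], repeat=4))
mean_drive = sum(quadratic_drive(h, w, a, b, c) for h in stimuli) / len(stimuli)
```

With the offset set this way, mean_drive vanishes up to floating-point error, whereas c = 0 would leave a constant bias of a w^T Σ w in the input to the latent state.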
EM is a standard method for fitting latent dynamical models; however, the Poisson observation model complicates computation and requires the use of approximations.\n\nE-step: With Gaussian latent states x_t, posterior inference amounts to computing the posterior means µ_t and covariances Q_t of the latent states, given data and current parameters. With Poisson observations exact inference becomes intractable, so that approximate inference has to be used [18, 20, 21, 22]. Here, we apply a global Laplace approximation [20, 9] to efficiently (linearly in experiment duration T) approximate the posterior distribution by a Gaussian. We note that each f_φ(h_t) in the E-step is deterministic, making posterior inference identical to standard PLDS models. We found a small number of iterations of Newton’s method sufficient to perform the E-step.\n\nM-step: In the M-step, each parameter is updated using the means µ_t and covariances Q_t inferred in the E-step. Given µ_t and Q_t, the parameters A and Q have closed-form update rules that are derived in standard texts [23]. For the Poisson likelihood, the M-step requires nonlinear optimization to update the parameters C, D and d [18, 9]. While for linear stimulus functions f_φ(h_t) the M-step has a closed-form solution, for nonlinear stimulus functions we optimize φ numerically. The objective function for φ is given by\n\ng(φ) = -(1/2) ∑_{t=2}^{T} [ (µ_t - A µ_{t-1} - f_φ(h_t))^T Q^{-1} (µ_t - A µ_{t-1} - f_φ(h_t)) ] + const.,    (6)\n\nwhere µ_t = E[x_t | y_{t-1}, h_t]. 
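The M-step objective for the stimulus parameters can be sketched in a few lines of Python. This is a minimal illustration, assuming a diagonal Q and taking the stimulus drive as precomputed values f_φ(h_t); the function name mstep_objective and the toy numbers are hypothetical. The example builds posterior means that follow the noiseless dynamics exactly, so the true drive attains the maximum g = 0 and any other drive scores lower.

```python
def mstep_objective(mu, A, drive_values, q_inv_diag):
    # g(phi) up to an additive constant:
    #   -1/2 * sum over t >= 2 of r_t^T Q^{-1} r_t,
    # with residual r_t = mu_t - A mu_{t-1} - f_phi(h_t).
    # q_inv_diag holds the diagonal of Q^{-1} (diagonal Q assumed here).
    n = len(A)
    total = 0.0
    for t in range(1, len(mu)):
        predicted = [
            sum(A[i][j] * mu[t - 1][j] for j in range(n)) for i in range(n)
        ]
        resid = [mu[t][i] - predicted[i] - drive_values[t][i] for i in range(n)]
        total += sum(q_inv_diag[i] * resid[i] ** 2 for i in range(n))
    return -0.5 * total


# Toy posterior means generated by noiseless dynamics with a known drive.
A = [[0.5, 0.0], [0.0, 0.25]]
true_drive = [[0.0, 0.0], [0.5, 1.0], [1.0, -0.5], [0.25, 0.0]]
mu = [[1.0, -1.0]]
for t in range(1, len(true_drive)):
    prev = mu[-1]
    mu.append([
        sum(A[i][j] * prev[j] for j in range(2)) + true_drive[t][i]
        for i in range(2)
    ])

zero_drive = [[0.0, 0.0] for _ in true_drive]
q_inv_diag = [4.0, 4.0]
g_true = mstep_objective(mu, A, true_drive, q_inv_diag)
g_zero = mstep_objective(mu, A, zero_drive, q_inv_diag)
```

In an actual fit the drive values would be recomputed from φ at every optimizer step, and a full Q^{-1} would replace the diagonal shortcut used here.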
If φ is represented as a vector concatenating all of its parameters, the gradient of g(φ) takes the form\n\n∂g(φ)/∂φ = Q^{-1} ∑_{t=2}^{T} (µ_t - A µ_{t-1} - f_φ(h_t)) ∂f_φ(h_t)/∂φ.    (7)\n\nFor the quadratic nonlinearity, the gradients of f_φ(h_t) with respect to its parameters take the form\n\n∂f_φ(h_t)/∂w_i = [2 a_i (h_t^T w_i) + b_i] h_t^T,\n∂f_φ(h_t)/∂a_i = (h_t^T w_i)^2,\n∂f_φ(h_t)/∂b_i = h_t^T w_i,\n∂f_φ(h_t)/∂c_i = 1.    (8)\n\nGradients for the quadratic model with multiplicative interactions take a similar form. When constrained to be 0-mean, the gradient for c_i disappears, and is replaced by an additional term in the gradients for a_i and w_i (arising from the constraint on c_i).\n\nWe found both computation time and quality of fit for QLDS to depend strongly upon the optimization procedure used. For long time series, we split the data into small minibatches. The QLDS E-step and M-step each naturally parallelize across minibatches. Neurophysiological experiments are often naturally segmented into separate trials across different stimuli and experimental conditions, making it possible to select minibatches without boundary effects.\n\n3 Application to simulated data\n\nWe illustrate the properties of QLDS using a simulated population recording of 100 neurons, each responding to a visual stimulus of binary, white spatio-temporal noise of dimensionality 240. We simulated a recording with T = 50000 samples and a 10-dimensional latent dynamical state. Five of the latent states received stimulus input from a bank of 5 stimulus filters (see Fig. 2A, top row), and the remaining latent dimensions only had recurrent dynamics and noise. We aimed to approximate the properties of real neural populations in early sensory cortex. In particular, we set the dynamics matrix A by fitting the model to a single neuron recording from V1 [4]. 
When fitting the model, we assumed the same dimensionalities (10 latent states, 5 stimulus inputs) as those used to generate the data. We ran 100 iterations of EM, which, for the recording length and dimensionality of this system, took about an hour on a 12-core Intel Xeon CPU at 3.5 GHz.\n\nThe model recovered by EM matched the statistics of the true model well. Linear dynamical system and quadratic models of stimulus selectivity both commonly have invariances that render a particular parameterization unidentifiable [4, 15], and QLDS is no exception: the latent state (and its parameters) can be rotated without changing the model’s properties. Hence it is possible only to compare the subspace recovered by the model, and not the individual filters. In order to visualize subspace recovery, we computed the best ℓ_2 approximation of the 5 “true” filters in the subspace spanned by the estimated ŵ_i (see Fig. 2A, bottom row). In QLDS, different neurons share different filters, and therefore these 5 filters provide a compact description of the stimulus selectivity of the population [24]. In contrast, for traditional single-neuron analyses [4] or ‘fully-connected’ models such as GLMs [14], one would estimate the receptive fields of each of the 100 neurons in the population, resulting in a much less compact representation with an order of magnitude more parameters for the stimulus-part alone (see Fig. 2B).\n\n[Figure 2, panels A-F; image not reproduced. Recoverable axis labels: total, stimulus and noise correlations (true vs. fit, by neuron index); eigenvalues of A (real vs. imaginary part); noise correlation vs. stimulus correlation; probability vs. number of synchronous spikes (true, fit).]\n\nFigure 2: Results on simulated data. Low-dimensional subspace recovery from a population of 100 simulated neurons in response to a white noise stimulus. (A) Simulated neurons receive shared input from 5 spatio-temporal receptive fields (top row). QLDS recovers a subspace capable of representing the original 5 filters (bottom row). (B) QLDS permits a more compact representation than the conventional approach of mapping receptive fields for each neuron. For comparison with the representation in panel A, we here show the spike-triggered averages of the first 60 neurons in the population. (C) QLDS also models shared variability across neurons, as visualised here by three different measures of correlation. Top: Total correlation coefficients between each pair of neurons. Values below the diagonal are from the simulated data, values above the diagonal correspond to correlations recovered by the model. Center: Stimulus correlations. Bottom: Noise correlations. (D) Eigenvalues of dynamics matrix A (black is ground truth, red is estimated). (E) In this model, stimulus and noise correlations are dependent on each other; for the parameters chosen in this simulation, there is a linear relationship between them. (F) Distribution of population spike counts, i.e. total number of spikes in each time bin across the population.\n\n[Figure 3, panels A-B; image not reproduced. Recoverable axis labels: reconstruction performance, MSE (log scale), vs. population size (# cells) and vs. experiment length (# samples); legend: linear, quadratic, quadratic cross.]\n\nFigure 3: Recovery of stimulus subspace as a function of population size (A) and experiment duration (B). Each point represents the best filter reconstruction performance of QLDS over 20 distinct simulations from the same “true” model, each initialized randomly and fit using the same number of EM iterations. Models were fit with each of three distinct stimulus nonlinearities: linear (blue), quadratic (green), and quadratic with multiplicative interactions (red). The stimulus input of the “true” model was a quadratic with multiplicative interactions, and therefore we expect only the multiplicative model (red) to reach low error rates.\n\nQLDS captures both the stimulus-selectivity of a population and correlations across neurons. In studies of neural coding, correlations between neurons (Fig. 2C, top) are often divided into stimulus-correlations and noise-correlations. Stimulus correlations capture correlations explainable by similarity in stimulus dependence (and are calculated by shuffling trials), whereas noise-correlations capture correlations not explainable by shared stimulus drive (and are calculated by correlating residuals after subtracting the mean firing rate across multiple presentations of the same stimulus). The QLDS-model was able to recover the total, stimulus and noise correlations in our simulation (Fig. 2C), although it was fit only to a single recording without stimulus repeats. 
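For readers unfamiliar with the two correlation measures just described, the following Python sketch shows one common way to compute them for a single pair of neurons when repeated presentations of the same stimulus are available. It is an illustration under assumptions, not the analysis code used for Fig. 2C: the function names are hypothetical, the toy responses are made up, and the random trial shuffle is replaced by a deterministic shift of one trial.

```python
import math


def pearson(u, v):
    # Plain Pearson correlation coefficient between two equal-length lists.
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def stimulus_and_noise_correlation(resp_a, resp_b):
    # resp_a, resp_b: lists of trials, each a list of counts per time bin,
    # recorded simultaneously from two neurons under a repeated stimulus.
    n_trials, n_bins = len(resp_a), len(resp_a[0])
    # Stimulus correlation via trial shuffling: pair neuron A on trial r
    # with neuron B on trial r + 1, which breaks within-trial noise coupling.
    shuf_a = [x for r in range(n_trials) for x in resp_a[r]]
    shuf_b = [x for r in range(n_trials) for x in resp_b[(r + 1) % n_trials]]
    stim_corr = pearson(shuf_a, shuf_b)
    # Noise correlation: correlate residuals after subtracting the mean
    # response across repeats in each time bin.
    mean_a = [sum(tr[t] for tr in resp_a) / n_trials for t in range(n_bins)]
    mean_b = [sum(tr[t] for tr in resp_b) / n_trials for t in range(n_bins)]
    res_a = [tr[t] - mean_a[t] for tr in resp_a for t in range(n_bins)]
    res_b = [tr[t] - mean_b[t] for tr in resp_b for t in range(n_bins)]
    noise_corr = pearson(res_a, res_b)
    return stim_corr, noise_corr


# Toy data: identical stimulus tuning, perfectly anti-correlated noise.
signal = [1, 3, 5, 3]
noise = [[1, 1, -1, -1], [-1, -1, 1, 1], [1, -1, -1, 1], [-1, 1, 1, -1]]
resp_a = [[s + e for s, e in zip(signal, row)] for row in noise]
resp_b = [[s - e for s, e in zip(signal, row)] for row in noise]
stim_corr, noise_corr = stimulus_and_noise_correlation(resp_a, resp_b)
```

In this toy case the two neurons share their tuning but have opposite trial-to-trial fluctuations, so the shuffled (stimulus) correlation is positive while the residual (noise) correlation is negative. With only four trials the shuffle estimate is itself noisy; real analyses average over many shuffles and repeats.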
Finally, the model also recovered the eigenvalues of the dynamics (Fig. 2D), the relationship between noise and stimulus correlations (Fig. 2E) and the distribution of population spike counts (Fig. 2F).\n\nWe assume that all stimulus dependence is captured by the subspace parameterized by the filters of the stimulus model. If this assumption holds, increasing the size of the population increases statistical power and makes identification of the stimulus selectivity easier rather than harder, in a manner similar to that of increasing the duration of the experiment. To illustrate this point, we generated multiple data-sets with larger population sizes, or with longer recording times, and show that both scenarios lead to improvements in subspace-recovery (see Fig. 3).\n\n4 Applications to Neural Data\n\nCat V1 with white noise stimulus: We evaluate the performance of the QLDS on multi-electrode recordings from cat primary visual cortex. Data were recorded from anaesthetized cats in response to a single repeat of a 20 minute long, full-field binary noise movie, presented at 30 frames per second, and 60 repeats of a 30 s long natural movie presented at 150 frames per second. Spiking activity was binned at the frame rate (33 ms for noise, 6.6 ms for natural movies). For noise, we used the first 18000 samples for training, and 5000 samples for model validation. For the natural movie, 40 repeats were used for training and 20 for validation. Silicon polytrodes (Neuronexus) were employed to record multi-unit activity (MUA) from a single cortical column, spanning all cortical layers with 32 channels. Details of the recording procedure are described elsewhere [25]. For our analyses, we used MUA without further spike-sorting from 22 channels for noise data and 25 channels for natural movies. We fit a QLDS with 3 stimulus filters, and in each case a 10-dimensional latent state, i.e. 
7 of the latent dimensions received no stimulus drive.\n\nSpike trains in this data-set exhibited “burst-like” events in which multiple units were simultaneously active (Fig. 4A). The model captured these events by using a dimension of the latent state with substantial innovation noise, leading to substantial variability in population activity across repeated stimulus presentations. We also calculated pairwise (time-lagged) cross-correlations for each unit pair, as well as the auto-correlation function for each unit in the data (Fig. 4B; 7 out of 22 neurons shown, results for other units are qualitatively similar). We found that samples from the model (Fig. 4B, red) closely matched the correlations of the data for most units and unit-pairs, indicating the QLDS provided an accurate representation of the spatio-temporal correlation structure of the population recording. The instantaneous correlation matrix across all 22 cells was very similar between the physiological and sampled data (Fig. 4C). We note that receptive fields (Fig. 4F) in this data did not have spatio-temporal profiles typical of neurons in cat V1 (this was also found when using conventional analyses such as spike-triggered covariance). Upon inspection, this was likely a consequence of an LGN afferent also being included in the raw MUA. In our analysis, a 3-feature model captured stimulus correlations (in held out data) more accurately than 1- and 2-filter models. However, 10-fold cross validation revealed that 2- and 3-filter models do not improve upon a 1-filter model in terms of one-step-ahead prediction performance (i.e. trying to predict neural activity on the next time-step using past observations of population activity and the stimulus).\n\nMacaque V1 with drifting grating stimulus: We wanted to evaluate the ability of the model to capture the correlation structure (i.e. 
noise and signal correlations) of a data-set containing multiple repetitions of each stimulus. To this end, we fit QLDS with a Poisson observation model to the population activity of 113 V1 neurons from an anaesthetized macaque, as described in [26]. Drifting grating stimuli were presented for 1280 ms, followed by a 1280 ms blank period, with each of 72 grating orientations repeated 50 times. We fit a QLDS with a 20-dimensional latent state and 15 stimulus filters, where the stimulus was parameterized as a set of phase-shifted sinusoids at the appropriate spatial and temporal frequency (making h_t 112-dimensional). We fit the QLDS to 35 repeats, and held out 15 for validation. The QLDS accurately captured the stimulus and noise correlations of the full population (Fig. 5A). Further, a QLDS with 15 shared receptive fields captured simple and complex cell behavior of all 113 cells, as well as response variation across orientation (Fig. 5B).\n\n[Figure 4, panels A-F; image not reproduced. Recoverable labels: rasters for data and for simulated repeats to identical noise stimuli, time (s); auto- and cross-correlograms over lags -20 to 20; total correlations (true vs. fit); eigenvalues of A (real vs. imaginary part); noise correlation vs. stimulus correlation; features 1-3 at lags -165 ms to 0 ms.]\n\nFigure 4: QLDS fit to V1 cells with noise stimuli. We fit QLDS to T = 18000 samples of 22 neurons responding to a white noise stimulus, data binned at 33 ms. We used the quadratic with multiplicative interactions as the stimulus nonlinearity. The QLDS has a 10-dimensional latent state with 3 stimulus inputs. All results shown here are compared against T = 5000 samples of test-data, not used to train the model. (A) Top row: Rasters from recordings from 22 cells in cat visual cortex, where cell index appears on the y axis, and time in seconds on the x axis. Second and third row: Two independent samples from the QLDS model responding to the same noise stimuli. Note that responses are highly variable across trials. (B) Auto- and cross-correlations for data (black) and model (red) cells. For the model, we average across 60 independent samples; thickness of red curves reflects 1 standard deviation from the mean. Panel (i, j) corresponds to cross-correlation between units with indices i and j, panels along the diagonal show auto-correlations. (C) Total correlations for the true (lower diagonal) and model (upper diagonal) populations. (D) Noise correlations scattered against stimulus correlations for the model. As we did not have repeat data for this population, we were not able to reliably estimate noise correlations, and thereby evaluate the accuracy of this model-based prediction. (E) Eigenvalues of the dynamics matrix A. (F) Three stimulus filters recovered by QLDS. We selected the 3-filter QLDS by inspection, having observed that fitting with a larger number of stimulus filters did not improve the fit. We note that although two of the filters appear similar, they drive separate latent dimensions with distinct mixing weights a_i, b_i and c_i.\n\n5 Discussion\n\nWe presented QLDS, a statistical model for neural population recordings from sensory cortex that combines low-dimensional, quadratic stimulus dependence with a linear dynamical system model. The stimulus model can capture simple and complex cell responses, while the linear dynamics capture temporal dynamics of the population and shared variability between neurons. 
We applied QLDS to population recordings from primary visual cortex (V1). The cortical microcircuit in V1 consists of highly-interconnected cells that share receptive field properties such as orientation preference [27], with a well-studied laminar organization [1]. Layer IV cells have simple cell receptive field properties, sending excitatory connections to complex cells in the deep and superficial layers. Quadratic stimulus models such as the classical “energy model” [16] of complex cells reflect this structure. The motivation of QLDS is to provide a statistical description of receptive fields in the different cortical layers, and to parsimoniously capture both stimulus dependence and correlations across an entire population.\n\nFigure 5: QLDS fit to 113 V1 cells across 35 repeats of each of 72 grating orientations. (A) Comparison of total correlations in the data and generated from the model. (B) For two cells (cells 49 and 50, using the index scheme from A) and 6 orientations (0, 45, 90, 135, 180, and 225 degrees), we show the posterior mean prediction performance (red traces) in comparison to the average across 15 held-out trials (black traces). In each block, we show predicted and actual spike rate (y-axis) over time binned at 10 ms (x-axis). Stimulus offset is denoted by a vertical blue line.\n\nAnother prominent neural population model is the GLM (Generalized Linear Model, e.g. [14]; or the “common input model”, [28]), which includes a separate receptive field for each neuron, as well as spike coupling terms between neurons. While the GLM is a successful model of a population’s statistical response properties, its fully-connected parameterization scales quadratically with population size. 
Furthermore, the GLM supposes direct couplings between pairs of neurons, while\nmonosynaptic couplings are statistically unlikely for recordings from a small number of neurons\nembedded in a large network.\nIn QLDS, latent dynamics mediate both stimulus and noise correlations. This reflects the structure\nof the cortex, where recurrent connectivity gives rise to both stimulus-dependent and stimulus-independent\ncorrelations. Because QLDS does not model a separate receptive field for each neuron, its model complexity\ngrows only linearly in population size, rather than quadratically as in fully-connected models\nsuch as the GLM [14]. Conceptually, our modeling approach treats the entire recorded population\nas a single \u201ccomputational unit\u201d, and aims to characterize its joint feature-selectivity and dynamics.\nNeurophysiology and neural coding are progressing toward recording and analyzing datasets of ever\nlarger scale. Population-level parameterizations, such as QLDS, provide a scalable strategy for\nrepresenting and analyzing the collective computational properties of neural populations.\n\nAcknowledgements\n\nWe are thankful to Arnulf Graf and the co-authors of [26] for sharing the data used in Fig. 5, and to\nMemming Park for comments on the manuscript. JHM and EA were funded by the German Federal\nMinistry of Education and Research (BMBF; FKZ: 01GQ1002, Bernstein Center T\u00fcbingen) and the\nMax Planck Society, and UK by National Eye Institute grant #EY019965. 
Collaboration between\nEA, JP and JHM was initiated at the \u2018MCN\u2019 Course at the Marine Biological Laboratory, Woods Hole.\n\nReferences\n\n[1] D. Hubel and T. Wiesel, \u201cReceptive fields, binocular interaction and functional architecture in the cat\u2019s visual cortex,\u201d J Physiol, pp. 106\u2013154, 1962.\n\n[2] S. L. Smith and M. H\u00e4usser, \u201cParallel processing of visual space by neighboring neurons in mouse visual cortex,\u201d Nat Neurosci, vol. 13, no. 9, pp. 1144\u20139, 2010.\n\n[3] D. S. Reich, F. Mechler, and J. D. Victor, \u201cIndependent and redundant information in nearby cortical neurons,\u201d Science, vol. 294, pp. 2566\u20132568, 2001.\n\n[4] N. C. Rust, O. Schwartz, J. A. Movshon, and E. P. Simoncelli, \u201cSpatiotemporal elements of macaque V1 receptive fields,\u201d Neuron, vol. 46, no. 6, pp. 945\u201356, 2005.\n\n[5] T. O. Sharpee, \u201cComputational identification of receptive fields,\u201d Annu Rev Neurosci, vol. 36, pp. 103\u201320, 2013.\n\n[6] M. M. Churchland, B. M. Yu, M. Sahani, and K. V. Shenoy, \u201cTechniques for extracting single-trial activity patterns from large-scale neural recordings,\u201d Curr Opin Neurobiol, vol. 17, no. 5, pp. 609\u2013618, 2007.\n\n[7] B. M. Yu, J. P. Cunningham, G. Santhanam, S. I. Ryu, K. V. Shenoy, and M. Sahani, \u201cGaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity,\u201d J Neurophysiol, vol. 102, no. 1, pp. 614\u2013635, 2009.\n\n[8] W. Truccolo, L. R. Hochberg, and J. P. 
Donoghue, \u201cCollective dynamics in human and monkey sensorimotor cortex: predicting single neuron spikes,\u201d Nat Neurosci, vol. 13, no. 1, pp. 105\u2013111, 2010.\n\n[9] J. H. Macke, L. B\u00fcsing, J. P. Cunningham, B. M. Yu, K. V. Shenoy, and M. Sahani, \u201cEmpirical models of spiking in neural populations,\u201d in Adv in Neural Info Proc Sys, vol. 24, 2012.\n\n[10] D. Pfau, E. A. Pnevmatikakis, and L. Paninski, \u201cRobust learning of low-dimensional dynamics from large neural ensembles,\u201d in Adv in Neural Info Proc Sys, pp. 2391\u20132399, 2013.\n\n[11] V. Mante, D. Sussillo, K. V. Shenoy, and W. T. Newsome, \u201cContext-dependent computation by recurrent dynamics in prefrontal cortex,\u201d Nature, vol. 503, pp. 78\u201384, Nov. 2013.\n\n[12] A. Fairhall, \u201cThe receptive field is dead. Long live the receptive field?,\u201d Curr Opin Neurobiol, vol. 25, pp. ix\u2013xii, 2014.\n\n[13] I. M. Park, E. W. Archer, N. Priebe, and J. W. Pillow, \u201cSpectral methods for neural characterization using generalized quadratic models,\u201d in Adv in Neural Info Proc Sys 26, pp. 2454\u20132462, 2013.\n\n[14] J. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. J. Chichilnisky, and E. P. Simoncelli, \u201cSpatio-temporal correlations and visual signalling in a complete neuronal population,\u201d Nature, vol. 454, no. 7207, pp. 995\u2013999, 2008.\n\n[15] J. W. Pillow and E. P. Simoncelli, \u201cDimensionality reduction in neural models: an information-theoretic generalization of spike-triggered average and covariance analysis,\u201d J Vis, vol. 6, no. 4, pp. 414\u201328, 2006.\n\n[16] E. H. Adelson and J. R. Bergen, \u201cSpatiotemporal energy models for the perception of motion,\u201d J Opt Soc Am A, vol. 2, no. 2, pp. 284\u201399, 1985.\n\n[17] A. C. Smith and E. N. Brown, \u201cEstimating a state-space model from point process observations,\u201d Neural Comput, vol. 15, no. 5, pp. 965\u2013991, 2003.\n\n[18] J. E. Kulkarni and L. 
Paninski, \u201cCommon-input models for multiple neural spike-train data,\u201d Network, vol. 18, no. 4, pp. 375\u2013407, 2007.\n\n[19] A. P. Dempster, N. M. Laird, and D. B. Rubin, \u201cMaximum likelihood from incomplete data via the EM algorithm,\u201d J R Stat Soc Ser B, vol. 39, no. 1, pp. 1\u201338, 1977.\n\n[20] L. Paninski, Y. Ahmadian, D. Ferreira, S. Koyama, K. Rahnama Rad, M. Vidne, J. Vogelstein, and W. Wu, \u201cA new look at state-space models for neural data,\u201d J Comput Neurosci, vol. 29, pp. 107\u2013126, 2010.\n\n[21] A. Z. Mangion, K. Yuan, V. Kadirkamanathan, M. Niranjan, and G. Sanguinetti, \u201cOnline variational inference for state-space models with point-process observations,\u201d Neural Comput, vol. 23, no. 8, pp. 1967\u20131999, 2011.\n\n[22] M. Emtiyaz Khan, A. Aravkin, M. Friedlander, and M. Seeger, \u201cFast dual variational inference for non-conjugate latent Gaussian models,\u201d in Proceedings of ICML, 2013.\n\n[23] Z. Ghahramani and G. E. Hinton, \u201cParameter estimation for linear dynamical systems,\u201d Univ. Toronto Tech Report, vol. 6, no. CRG-TR-96-2, 1996.\n\n[24] J. H. Macke, G. Zeck, and M. Bethge, \u201cReceptive fields without spike-triggering,\u201d in Adv in Neural Info Proc Sys, vol. 20, pp. 969\u2013976, 2008.\n\n[25] U. K\u00f6ster, J. Sohl-Dickstein, C. M. Gray, and B. A. Olshausen, \u201cModeling higher-order correlations within cortical microcolumns,\u201d PLoS Comput Biol, vol. 10, no. 7, p. e1003684, 2014.\n\n[26] A. B. Graf, A. Kohn, M. Jazayeri, and J. A. Movshon, \u201cDecoding the activity of neuronal populations in macaque primary visual cortex,\u201d Nat Neurosci, vol. 14, no. 2, pp. 239\u2013245, 2011.\n\n[27] V. Mountcastle, \u201cModality and topographic properties of single neurons of cat\u2019s somatic sensory cortex,\u201d J Neurophysiol, 1957.\n\n[28] M. Vidne, Y. Ahmadian, J. 
Shlens, J. Pillow, J. Kulkarni, A. Litke, E. Chichilnisky, E. Simoncelli, and L. Paninski, \u201cModeling the impact of common noise inputs on the network activity of retinal ganglion cells,\u201d J Comput Neurosci, 2011.\n", "award": [], "sourceid": 243, "authors": [{"given_name": "Evan", "family_name": "Archer", "institution": "MPI for Biological Cybernetics"}, {"given_name": "Urs", "family_name": "Koster", "institution": "University of California at Berkeley"}, {"given_name": "Jonathan", "family_name": "Pillow", "institution": "UT Austin"}, {"given_name": "Jakob", "family_name": "Macke", "institution": "MPI for Biological Cybernetics"}]}