{"title": "On the relations of LFPs & Neural Spike Trains", "book": "Advances in Neural Information Processing Systems", "page_first": 2060, "page_last": 2068, "abstract": "One of the goals of neuroscience is to identify neural networks that correlate with important behaviors, environments, or genotypes. This work proposes a strategy for identifying neural networks characterized by time- and frequency-dependent connectivity patterns, using convolutional dictionary learning that links spike-train data to local field potentials (LFPs) across multiple areas of the brain. Analytical contributions are: (i) modeling dynamic relationships between LFPs and spikes; (ii) describing the relationships between spikes and LFPs, by analyzing the ability to predict LFP data from one region based on spiking information from across the brain; and (iii) development of a clustering methodology that allows inference of similarities in neurons from multiple regions. Results are based on data sets in which spike and LFP data are recorded simultaneously from up to 16 brain regions in a mouse.", "full_text": "On the Relationship Between LFP & Spiking Data\n\nDavid E. Carlson1, Jana Schaich Borg2, Kafui Dzirasa2, and Lawrence Carin1\n\n1Department of Electrical and Computer Engineering\n2Department of Psychiatry and Behavioral Sciences\n\n{david.carlson, jana.borg, kafui.dzirasa, lcarin}@duke.edu\n\nDuke University\nDurham, NC 27701\n\nAbstract\n\nOne of the goals of neuroscience is to identify neural networks that correlate with\nimportant behaviors, environments, or genotypes. This work proposes a strategy\nfor identifying neural networks characterized by time- and frequency-dependent\nconnectivity patterns, using convolutional dictionary learning that links spike-train\ndata to local field potentials (LFPs) across multiple areas of the brain. 
Analytical\ncontributions are: (i) modeling dynamic relationships between LFPs and spikes;\n(ii) describing the relationships between spikes and LFPs, by analyzing the ability\nto predict LFP data from one region based on spiking information from across the\nbrain; and (iii) development of a clustering methodology that allows inference\nof similarities in neurons from multiple regions. Results are based on data sets in\nwhich spike and LFP data are recorded simultaneously from up to 16 brain regions\nin a mouse.\n\n1 Introduction\n\nOne of the most fundamental challenges in neuroscience is the \u201clarge-scale integration problem\u201d:\nhow does distributed neural activity lead to precise, unified cognitive moments [1]? This paper seeks\nto examine this challenge from the perspective of extracellular electrodes inserted into the brain. An\nextracellular electrode inserted into the brain picks up two types of signals: (1) the local field potential\n(LFP), which represents local oscillations in frequencies below 200 Hz; and (2) single neuron\naction potentials (also known as \u201cspikes\u201d), which typically occur in frequencies of 0.5 kHz. LFPs\nrepresent network activity summed over long distances, whereas action potentials represent the precise\nactivity of cells near the tip of an electrode. Although action potentials are often treated as the\n\u201ccurrency\u201d of information transfer in the brain, relationships between behaviors and LFP activity\ncan be as precise as, and sometimes more precise than, those with the activity of individual\nneurons [2,3]. Further, LFP network disruptions are highly implicated in many forms of psychiatric\ndisease [4]. This has led to much interest in understanding the mechanisms of how LFPs and action\npotentials interact to create specific types of behaviors. 
New multisite recording techniques that\nallow simultaneous recordings from a large number of brain regions provide unprecedented\nopportunities to study these interactions. However, this type of multi-dimensional data poses significant\nchallenges that require new analysis techniques.\nThree of the most challenging characteristics of multisite recordings are that: 1) the networks they\nrepresent are dynamic in space and time, 2) subpopulations of neurons within a local area can have\ndifferent functions and may therefore relate to LFP oscillations in specific ways, and 3) different\nfrequencies of LFP oscillations often relate to single neurons in specific ways [5]. Here new models are\nproposed to examine the relationship between neurons and neural networks that accommodate these\ncharacteristics. First, each LFP in a brain region is modeled as convolutions between a bounded-time\ndictionary element and the observed spike trains. Critically, the convolutional factors are allowed\nto be dynamic, by binning the LFP and spike time series, and modeling the dictionary element for\neach bin of the time series. Next, a clustering model is proposed making each neuron\u2019s dictionary\nelement a scaled version of an autoregressive template shared among all neurons in a cluster. This\nallows one to identify sub-populations of neurons that have similar dynamics over their functional\nconnectivity to a brain region. Finally, we provide a strategy for exploring which frequency bands\ncharacterize spike-to-LFP functional connectivity. We show, using two novel multi-region\nelectrophysiology datasets from mice, how these models can be used to identify coordinated interactions\nwithin and between different neuronal subsystems, defined jointly by the activity of single cells and\nLFPs. 
These methods may lead to better understanding of the relationship between brain activity\nand behavior, as well as the pathology underlying brain diseases.\n2 Model\n2.1 Data and notation\nThe data used here consist of multiple LFP and spike-train time series, measured simultaneously\nfrom M regions of a mouse brain. Spike sorting is performed on the spiking data by a VB\nimplementation of [6], from which J single units are assumed detected from across the multiple regions\n(henceforth we refer to single units as \u201cneurons\u201d); the number of observed neurons J depends on\nthe data considered, and is inferred as discussed in [6]. Since multiple microwires are inserted into\nsingle brain regions in our experiments (described in [7]), we typically detect between 4 and 50 neurons\nfor each of the M regions in which the microwires are inserted (discussed further when presenting\nresults). The analysis objective is to examine the degree to which one may relate (predict) the LFP\ndata from one brain region using the J-neuron spiking data from all brain regions. This analysis\nallows the identification of multi-site neural networks through the examination of the degree to which\nneurons in one region are predictive of LFPs in another.\nLet x \u2208 R^T represent a time series of LFP data measured from a particular brain region. The T\nsamples are recorded on a regular grid, with temporal interval \u2206. The spike trains from J different\nneurons (after sorting) are represented by the set of vectors {y1, . . . , yJ}, binned in the same\nmanner temporally as the LFP data. Each yj \u2208 Z_+^T is reflective of the number of times neuron\nj \u2208 {1, . . . , J} fired within each of the T time bins, where Z_+ represents nonnegative integers.\nIn the proposed model LFP data x are represented as a superposition of signals associated with each\nneuron yj, plus a residual that captures LFP signal unrelated to the spiking data. 
The contribution\nto x from information in yj is assumed generated by the convolution of yj with a bounded-time\ndictionary element dj (residing within the interval -L to L, with L \u226a T). This model is related to\nconvolutional dictionary learning [8], where the observed (after spike sorting) signal yj represents\nthe signal we convolve the learned dictionary dj against.\nWe model dj as time evolving, motivated by the expectation that neuron j may contribute differently\nto specified LFP data, based upon the latent state of the brain (which will be related to observed\nanimal activity). The time series x is binned into a set of B equal-size contiguous windows, where\nx = vec([x1, . . . , xB]), and likewise yj = vec([yj1, . . . , yjB]). The dictionary element for neuron\nj is similarly binned as {dj1, . . . , djB}, and the contribution of neuron j to xb is represented as a\nconvolution of djb and yjb. This bin size is a trade-off between how finely time is discretized and\nthe computational costs.\nIn the experiments, in one example the bins are chosen to be 30 seconds wide (novel-environment\ndata) and in the other 1 minute (sleep-cycle data), and these are principally chosen for computational\nconvenience (the second data set is nine times longer). Similar results were found with windows as\nnarrow as 10 seconds, or as wide as 2 minutes.\n2.2 Modeling the LFP contribution of multiple neurons jointly\nGiven {y1, . . . , yJ}, the LFP voltage at time window b is represented as\n\nx_b = \u2211_{j=1}^{J} y_{jb} \u2217 d_{jb} + \u03b5_b    (1)\n\nwhere \u2217 represents the convolution operator. Let Dj = [dj1, . . . , djB] \u2208 R^{(2L+1)\u00d7B} represent\nthe sequence of dictionary elements used to represent the LFP data over the B windows, from the\nperspective of neuron j. 
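As an illustration of Eq. (1), the following minimal NumPy sketch forms the spike-driven part of the LFP in one window b as a sum of spike-count vectors convolved with bounded-time dictionary elements. The function name predict_lfp_window, the toy sizes, and the use of same-mode convolution (so that the output keeps the window length and lag 0 sits at the centre of each element) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def predict_lfp_window(spikes, dictionary):
    # spikes: (J, T_b) nonnegative spike counts y_jb for one window b
    # dictionary: (J, 2L+1) elements d_jb defined on lags -L..L
    n_neurons, n_samples = spikes.shape
    x_hat = np.zeros(n_samples)
    for j in range(n_neurons):
        # mode='same' keeps the central T_b samples of the full convolution,
        # so a spike at time t contributes d_jb centred on t
        x_hat += np.convolve(spikes[j], dictionary[j], mode='same')
    return x_hat

# Toy example (all sizes and noise levels are hypothetical)
rng = np.random.default_rng(0)
J, T_b, L = 3, 200, 10
spikes = rng.poisson(0.05, size=(J, T_b))
dictionary = rng.normal(size=(J, 2 * L + 1))
# Additive Gaussian residual plays the role of the noise term in Eq. (1)
x_b = predict_lfp_window(spikes, dictionary) + rng.normal(scale=0.1, size=T_b)
```

In this sketch a single isolated spike simply stamps a copy of the neuron's dictionary element onto the predicted LFP, centred on the spike time; overlapping spikes superpose linearly.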
We impose the clustering prior\n\nD_j = \u03b6_j A_j,  A_j \u223c G,  G \u223c DP(\u03b2, G_0)    (2)\n\nwhere G is a draw from a Dirichlet process (DP) [9, 10], with scale parameter \u03b2 > 0 and base\nprobability measure G0. Note that we cluster the shape of the dictionary elements, and each neuron\nhas its own scaling \u03b6 \u2208 R. Concerning the base measure, we impose an autoregressive prior on the\ntemporal dynamics, and therefore G0 is defined by an AR(\u03b1, \u03b3) process\n\na_b = \u03b1 a_{b\u22121} + \u03bd_t,  \u03bd_t \u223c N(0, \u03b3^{\u22121} I)    (3)\n\nwhere I is the identity matrix. This AR prior is used to constitute the B columns of the DP \u201catoms\u201d\nA\u2217_h = (a\u2217_{h1}, . . . , a\u2217_{hB}), with G = \u2211_{k=1}^{\u221e} \u03c0_k \u03b4_{A\u2217_k}. The elements of the vector \u03c0 = (\u03c01, \u03c02, . . . ) are\ndrawn from the \u201cstick-breaking\u201d [9] process \u03c0_h = V_h \u220f_{i<h} (1 \u2212 V_i) with V_h \u223c Beta(1, \u03b2). We\nplace the prior Gamma(a_\u03b2, b_\u03b2) on \u03b2, and priors Uniform(0,1) and Gamma(a_\u03b3, b_\u03b3) respectively on\n\u03b1 and \u03b3. To complete the model, we place the prior N(0, \u03c4^{\u22121} I) on \u03b5_b, and \u03b6j \u223c N(0, 1).\nIn the implementation, a truncated stick-breaking representation is employed for G, using K \u201csticks\u201d\n(V_K = 1), which simplifies the implementation and has been shown to be effective in practice [9] if\nK is made large enough, and the size of K is inferred during the inference algorithm.\nSpecial cases of this model are clear. For example, if the Aj are simply drawn i.i.d. from G0, rather\nthan from the DP, each neuron is allowed to contribute its own unique dictionary shape to represent\nxb, called a non-clustering model in the results. In [11] the authors considered a similar model, but\nthe time evolution of dj was not considered (each neuron was assumed to contribute in the same way\nto represent the LFP, independent of time). 
Further, in [11] only a single neuron was considered, and\ntherefore no clustering was considered. A multi-neuron version of this model is obtained by setting\nB = 1.\n3 Inference\n3.1 Mean-field Variational Inference\nLetting \u0398 = {z, \u03b6, A1,...,K, V1,...,K, \u03b2, \u03b1, \u03b3}, the full likelihood of the clustering model is\n\np(x, \u0398) = \u220f_{b=1}^{B} [p(x_b|\u0398)] \u220f_{j=1}^{J} [p(z_j|\u03c0) p(\u03b6_j)] \u220f_{k=1}^{K} [p(A\u2217_k|\u03b1, \u03b3) p(V_k|\u03b2)] p(\u03b2, \u03b1, \u03b3)    (4)\n\nThe non-clustering model can be recovered by setting zj = \u03b4j and the truncation level in the\nstick-breaking process K to J. The time-invariant model is recovered by setting the number of bins B\nto 1, with or without clustering. The model of [11] is recovered by using a single bin and a single\nneuron.\nMany recent methods [12, 13] have been proposed to provide quick approximations to the Dirichlet\nprocess mixture model. Critically, in these models the latent assignment variables are conditionally\nindependent when the DP parameters are given. However, in the proposed model this assumption\ndoes not hold because the observation x is the superposition of the convolved draws from the\nDirichlet process.\nA factorized variational distribution q is proposed to approximate the posterior distribution, and\nthe non-clustering model arises as a special case of the clustering model. The inference to fit the\ndistribution q is based on Bayesian Hierarchical Clustering [13] and the VB Dirichlet Process\nSplit-Merge method [12]. The proposed model does not fit in either of these frameworks, so a method\nthat learns K by merging clusters, adapted from [12, 13], is presented in Section 3.1.1. 
The factorized\ndistribution q takes the form:\n\n(cid:34)\n\n(cid:89)\n\n(cid:89)\n\n(cid:35)\n\n(cid:89)\n\nq(\u0398) =\n\nq(zj)\n\nq(\u03b6jk)\n\nq(\u03b2)q(\u03b1)q(\u03b3)\n\n[q(A\u2217\n\nk)q(Vk)]\n\n(5)\n\nj\n\nk\n\nk\n\n\u03b3, b(cid:48)\n\n\u03b3), q(\u03b1) = N(0,1)(\u02c6\u03b1, \u03b7\u22121\n\n\u03b1 ), q(Ak) = N (vec(Ak); vec(\u02c6ak1, . . . , \u02c6akB), \u039b\u22121\n\nStandard forms on these distributions are assumed, with q(zj) = Categorical(rj), q(\u03b3) =\nGamma(a(cid:48)\nk ), \u03a3k =\n\u039bk, and q(\u03b2) = Gamma(a(cid:48)\n\u03b2). To facilitate inference, the distribution on \u03b6j is split into\nq(\u03b6jk) = N (\u00b5jk, \u03b7\u22121\njk ), the variational distribution for \u03b6 on the jth spike train given that it is in\ncluster k. The non-clustering model can be represented as a special case of the clustering model\nwhere q(\u03b6jk) = \u03b41, and q(zj) = \u03b4j. As noted in [12], this factorized posterior has the property that\na q with K(cid:48) clusters is nested in a representation of q for K clusters for K \u2265 K(cid:48), so any number of\nclusters up to K(cid:48) is represented.\n\n\u03b2, b(cid:48)\n\n3\n\n\fVariational algorithms \ufb01nd a q that minimizes the KL divergence from the true, intractable posterior\n[14], \ufb01nding a q that locally maximizes the evidence lower bound (ELBO) objective:\n\n1,...,K, \u03b2, \u03b1, \u03b3)]\n\nb = xb \u2212(cid:80)\n\nlog p(x|\u0398) \u2265 L(q) = Eq[log p(x, z, \u03b6, A\u2217\n\n1,...,K, \u03b2, \u03b1, \u03b3|\u0398) \u2212 log q(z, \u03b6, A\u2217\n(cid:80)Tb\nj(cid:48)(cid:54)=j yb \u2217 ((cid:80)\n\n(6)\nTo facilitate inference, approximations to p(y|\u0398) are developed. Let Tb be the number of time\npoints in bin b, and de\ufb01ne Rjib \u2208 R(2L+1)\u00d7(2L+1) with entries Rjib,ik = 1\nt=1 yjb,tyib,t+k\u2212i;\n(cid:80)Tb\nyjb,t is yj at time point t in window/bin b. 
Let x\nk rjk\u00b5jk \u02c6akb), or\nthe residual after all but the contribution from the jth neuron have been removed, and de\ufb01ne let\n\u2212j\nt=1 yjb,txb,t+i for i \u2208 {\u2212L, . . . , L}. Both Rjb and \u03bdjb\njb \u2208 R2L+1 with entries \u03bdj\n\u03bd\n\u2212j\nb |yjb, djb) =\ncan be ef\ufb01ciently estimated with the FFT. For each time bin b, we can write: log p(x\n\u2212j\nconst \u2212 \u03c4\njb )T djb)\nk\u2217 \u02c6akb. \u03a3kbb(cid:48) denotes\nTo de\ufb01ne the key updates, let y(cid:48)\nthe block in \u03a3k indexing the b and the b(cid:48) bins, which is ef\ufb01ciently calculated because \u03a3\u22121\nis a block\ntri-diagonal matrix from the \ufb01rst-order autoregressive process, and explicit equations exist. Letting\n\u02c6N(cid:48)\nk. For q(\u03b6jk), the\n\u2212j\nkb + \u03a3kbb)), and \u00b5jk = \u03b7\u22121\njb .\n\nb,t \u2212(cid:80)L\nb = xb\u2212(cid:80)\nj rjk, then q(Vk) is updated by are ak = 1 + \u02c6N, bk = \u02c6\u03b2 +(cid:80)K\n\nkb =(cid:80)\n\u02c6Nk =(cid:80)\nparameters are updated \u03b7jk = 1 +(cid:80)\n\n(cid:96)=\u2212L yj,b,t+(cid:96)dj,b,\u2212(cid:96))2 (cid:39) const \u2212 \u03c4 Tb\n2 (dT\nj rjk\u00b5jkyjb, and x\u2212k\n\njbRjjbd \u2212 2(\u03bd\nj(cid:48)(cid:54)=j y(cid:48)\n\nb trace(Rjb(\u02c6akb \u02c6aT\n\nji = 1\nTb\n\n(cid:80)\n\nkbRjb\u03bd\n\nk(cid:48)=k+1\n\nb \u02c6aT\n\n2 (x\n\n\u2212j\n\n\u2212j\n\nTb\n\njk\n\nk\n\n(cid:88)\n\nThe clustering latent variables are updated sequentially by:\nlog(rjk) \u221d \u2212 \u03c4\n2\ncan be used to calculate q(A\u2217\n\njk )tr(Rjb(Tb\u03a3kbb+ \u02c6akb \u02c6aT\n\n[(\u00b5jk +\u03b7\u22121\n\nb\n\nkb))\u22122\u00b5jk(x\n\n\u2212j\nb )T (ybRjbb \u02c6akb))]+Eq[q(\u03c0)]\n\nb\n\nb\n\nk\n\nand y\u2212k\n\nk). The mean of the distribution q(Ak) is evaluated\nis a block tridiagonal matrix,\nk) are found in Section A of the\n\nx\u2212k\nusing the forward \ufb01ltering-backward smoothing algorithm, and \u03a3\u22121\nenabling ef\ufb01cient computations. 
Further details on updating q(A\u2217\nSupplemental Material. Approximating distributions q(\u03b2), q(\u03b1) and q(\u03b3) are standard [14, 15].\n3.1.1 Merge steps\nThe model is initialized to K = J clusters and the algorithm \ufb01rst \ufb01nds q for the non-clustering\nmodel. This initialization is important because of the superposition measurement model. The algo-\nrithm proceeds to merge down to K(cid:48), where K(cid:48) is a local mode of the VB algorithm. The procedure\nis as follows: (i) Randomly choose two clusters k and k(cid:48) to merge.\n(ii) Propose a new varia-\ntional distribution \u02dcq with K \u2212 1 clusters. (iii) Calculate the change in the variational lower bound,\nL(\u02dcq) \u2212 L(q), and accept the merge if the variational lower bound increases. As in [12], intelligent\nsampling of k and k(cid:48) signi\ufb01cantly improves performance. Here, we sample k and k(cid:48) with weight\nproportional to exp(\u2212K(Ak, Ak(cid:48); c0)), where K(\u00b7,\u00b7; c0) is the radial basis function.\nIn [13] all\npairwise clusterings were considered, but that is computationally infeasible in this problem. This\napproach for merging clusters is similar to that developed in [12].\nThis algorithm requires ef\ufb01cient estimation of the difference in the lower bound. For a proposed k\nand k(cid:48), a new variational distribution \u02dcq is proposed, with \u02dcq(zj = k) = q(zj = k) + q(zj = k(cid:48))\n\u02c6Nk(cid:63) ), q(\u03b2k(cid:48)) = \u03b40, and\nk rjk log rjk, the difference in the lower bound can\np(Ak|\u03b1, \u03b3)\n\nand \u02dcq(zj = k(cid:48)) = 0, \u02dcq(\u03b2k) = Beta(a0 + \u02c6Nk + \u02c6Nk(cid:48), b0 +(cid:80)K,k(cid:63)(cid:54)=k(cid:48)\nq(Ak) is calculated. 
Letting H(q) = \u2212(cid:80)\n\n(cid:80)\n\nk(cid:63)=k+1\n\nj\n\nbe calculated:\nL(\u02dcq) \u2212 L(q) = E\u02dcq\n\nlog p(y|A1,...,K, \u03b6, \u03c4 )\n\n\u2212 H(\u02dcq) + H(\u02dcp)\n\n(7)\n\n(cid:21)\np(\u03b2k))\nq(\u03b2k)\np(Ak|\u03b1, \u03b3)p(A(cid:48)\nk|\u03b1, \u03b3)\nq(Ak)q(A(cid:48)\nk)\n\n\u02dcq(Ak)\n\n(cid:21)\n\n\u2212 Eq\n\nlog p(y|A1,...,K, \u03b6, \u03c4 )\n\np(\u03b2k)p(\u03b2k(cid:48))\nq(\u03b2k)q(\u03b2k(cid:48))\n\n+ H(q) \u2212 H(p)\n\nExplicit details on the calculations of these variables are found in Section A of the Supplementary\nMaterial, and the block tridiagonal nature of \u039bk allows the complete calculation of this value in\nO(BTb(( \u02c6Nk + \u02c6Nk(cid:48)) + L3)). This is linear in the amount of data used in the model. The algorithm\nis stopped after 10 merges in a row are rejected.\n3.2 Integrated Nested Laplacian Approximation for the Non-Clustering Model\nThe VB inference method assumes a separable posterior. In the non-clustering model, Integrated\nNested Laplacian Approximation (INLA) [16] was used to estimate of the joint posterior, without\n\n4\n\n(cid:20)\n(cid:20)\n\n\fAnimal\n\n1\n2\n3\n4\n5\n6\n\nInvariant Non-Cluster Clustering\n0.1394\n0.1465\n0.2251\n0.0867\n0.1238\n0.0675\n\n0.2094\n0.2340\n0.3414\n0.1434\n0.1882\n0.1351\n\n0.1968\n0.2382\n0.3050\n0.1433\n0.1867\n0.1407\n\nAnimal\n\n7\n8\n9\n10\n11\n\nInvariant Non-Cluster Clustering\n0.2442\n0.1385\n0.3182\n0.0902\n0.2362\n0.1597\n0.0865\n0.0311\n0.1161\n0.675\n\n0.2567\n0.3440\n0.1881\n0.0803\n0.1064\n\nTable 1: Mean held-out RFE of the multi-cell models predicting the Hippocampus LFP. \u201cInvariant\u201d denotes the\ntime-invariant model, \u201cNon-cluster\u201d and \u201cclustering\u201d denote the dynamic model without and with clustering.\n\nFigure 1: (Left) Mean single-cell holdout RFE predicting mouse 3\u2019s Nucleus Accumbens LFP comparing the\ndynamic and time-invariant model. Each point is a single neuron. 
(Middle) Convolutional dictionary for a\nVTA cell predicting mouse 3\u2019s Nucleus Accumbens LFP at 5 minutes, 15 minutes, and 38 minutes after the\nexperiment start. (Right) Hold-out RFE over experiment time with the time-invariant, non-clustering, and the\nclustering model to predict mouse 3\u2019s Hippocampus LFP.\n\nassuming separability. Comparisons to INLA constitute an independent validation of VB, for in-\nference in the non-clustering version of the model. The INLA inference procedure is detailed in\nSupplemental Section B. INLA inference was found to be signi\ufb01cantly slower than the VB approxi-\nmation, so experimental results below are shown for VB. The INLA and VB predictive performance\nwere quantitatively similar for the non-clustering model, providing con\ufb01dence in the VB results.\n4 Experiments\n4.1 Results on Mice Introduced to a Novel Environment\nThis data set is from a group of 12 mice consisting of male Clock-\u220619 (mouse numbers 7-12)\nand male wild-type littermate controls (mouse numbers 1-6) (further described in [7]). For each\nanimal, 32-48 total microwires were implanted, with 6-16 wires in each of the Nucleus Accumbens,\nHippocampus (HP), Prelimbic Cortex (PrL), Thalamus, and the Ventral Tegmental Area (VTA).\nLFPs were averaged over all electrodes in an area and \ufb01ltered from 3-50Hz and sampled at 125\nHz. Neuronal activity was recorded using a Multi-Neuron Acquisition Processor (Plexon). 99-192\nindividual spike trains (single units) were detected per animal. In this dataset animals begin in their\nhome cage, and after 10 minutes are placed in a novel environment for 30 minutes. For analysis, this\n40 minute data sequence was binned into 30 second chunks, giving 80 bins. 
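The binning arithmetic stated above can be checked with a short sketch. The variable names are hypothetical, and the zero-filled array only stands in for an LFP trace; the numbers (125 Hz sampling, a 40-minute recording, 30-second bins) are the ones given in the text.

```python
import numpy as np

fs_hz = 125            # LFP sampling rate stated in the text
duration_s = 40 * 60   # 10 min in the home cage + 30 min in the novel environment
bin_s = 30             # bin width used for this data set

n_bins = duration_s // bin_s        # number of windows B
samples_per_bin = fs_hz * bin_s     # samples T_b in each window

# Placeholder LFP trace, split into B contiguous windows of T_b samples each
x = np.zeros(fs_hz * duration_s)
windows = x.reshape(n_bins, samples_per_bin)
```

Under these numbers each window holds 3750 LFP samples, and leave-one-out cross-validation over time bins holds out one row of this array at a time.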
For all experiments we\nchoose L such that the dictionary element covered 0.5 seconds before and after each spike event.\nCross-validation was performed using leave-one-out analysis over time bins, using the error metric\nof reduction in fractional error (RFE), 1 \u2212 ||x_b \u2212 \u02c6x_b||_2^2 / ||x_b||_2^2. Figure 1(left) shows the average\nhold-out RFE for the time-invariant model and the dynamic model for a single spike train predicting\nmouse 3\u2019s Nucleus Accumbens, showing that the dynamic model can give strong improvements\non the scale of a single cell (these results are typical). The dynamic model has a higher hold-out\nRFE on 98.4% of detected cells across all animals and all regions, indicating that the dynamic\nmodel generally outperforms the time-invariant model. A dynamic dictionary element from a VTA\ncell predicting mouse 3\u2019s Nucleus Accumbens is shown in Figure 1(middle). At the beginning\nof the experiment, this cell is linked with a slow, high-amplitude oscillation. After the animal is\ninitially placed into a new environment (illustrated by the 15-minute data point), the amplitude of the\ndictionary element drops close to zero. Once the animal becomes accustomed to its new environment\n(illustrated by the 38-minute data point), the cell\u2019s original periodic dictionary element begins to\nappear again. This example shows how cells and LFPs clearly have time-evolving relationships.\nThe leave-one-out performance of the time-invariant, non-clustering, and clustering models\npredicting animal 3\u2019s Hippocampus LFP with 182 neurons is shown in Figure 1(right). These results show\nthat predictability changes over time, and indicate that there is a strong increase in LFP predictability\nwhen the mouse is placed in the novel environment. Using dynamics improves the results\ndramatically, and the clustering hold-out results showed further improvements in hold-out performance.\nThe mean hold-out RFE results for the Hippocampus for 11 animals are shown in Table 1 (1 animal\nwas missing this region recording). Results for other regions are shown in Supplemental Tables 1,\n2, 3, and 4, and show similar results.\n\nFigure 2: Example clusters predicting mouse 3\u2019s Hippocampus LFP. The top part shows the convolutional\nfactor throughout the duration of the experiment, and the bottom part shows the location of the cells in the cluster.\nSome of the clusters are dynamic whereas others were consistent through the duration of the experiment.\n\nFigure 3: (Left) RFE as a function of time bin and frequency bin for all Hippocampus cells predicting the\nThalamus LFP. There is a change in the predictive properties around 10 minutes. (Middle) Total energy versus\nthe unexplained residual for the Hippocampus cells predicting the Thalamus LFP for the frequency band 25-35\nHz. (Right) RFE using only the cluster of cells shown in Figure 2(right).\n\nIn this dataset, there is little quantitative difference between the clustering and non-clustering\nmodels; however, the clustering result is much better for interpretation. One reason for this is that\nspike-sorting procedures are notoriously imprecise, and often under- or over-cluster. 
A clustering\nmodel with equivalent performance is evidence that many neurons have the same shapes and\ndynamics, and repeated dynamic patterns reduce concerns that dynamics are the result of failure to\ndistinguish distinct neurons. Similarly, clustering of neuron shapes in a single electrode could be\nthe result of over-clustering from the spike-sorting algorithm, but clustering across electrodes gives\nstrong evidence that truly different neurons are clustering together. Additionally, neural action\npotential shapes drift over time [6, 17], but since cells in a cluster come from different electrodes and\nregions, this is strong evidence that the dynamics are not due to over-sorting drifting neurons.\nEach cluster has both a dynamic shape result and a neural distribution over regions.\nExample clustering shapes and histogram cell locations for clusters predicting mouse 3\u2019s Thalamus\nLFP are shown in Figure 2. The top part of this figure shows the base dictionary element evolution\nover the duration of the experiment. Note that both the (left) and (middle) plots show a dynamic\neffect around 10 minutes, and the cells primarily come from the Ventral Tegmental Area. The (right)\nplot shows a fairly stable factor, and its cells are mostly in the Hippocampus region.\nThe ability to predict the LFP constitutes functional connectivity between a neuron and the neuronal\ncircuit around the electrode for the LFP [18]. Neural circuits have been shown to transfer information\nthrough specific frequencies of oscillations, so it is of scientific interest to know the functional\nconnectivity of a group of neurons as a function of frequency [5]. 
Frequency relationships were\nexplored by filtering the LFP signal after the predicted signal has been removed, using a notch filter\nat 1 Hz intervals with a 1 Hz bandwidth, and the RFE was calculated for each held-out time bin and\nfrequency bin.\nAll cells in the Thalamus were used to predict each frequency band in mouse 3\u2019s Hippocampus LFP,\nand this result is shown in Figure 3(left). This figure shows an increase in RFE of the 25-35 Hz band\nafter the animal has been moved to a new location. The RFE on the band from 25-35 Hz is shown\nin Figure 3(middle), and shows that while the raw energy in this frequency band is much higher\nafter the move to the novel environment, the cells from the Hippocampus can explain much of the\nadditional energy in this band. In Figure 3(right), we show the same result using only the cluster in\nFigure 2. Note that there is a change around 10 minutes that is due to both a slight change in the\nconvolutional dictionary and a change in the neural firing patterns.\n\nTable 2: Mean held-out RFE of the animal going through sleep cycles in each region.\n\nRegion\n\nTime-Invariant\nNon-Clustering\n\nClustering\n\nRegion\n\nTime-Invariant\nNon-Clustering\n\nClustering\n\nPrLCx\n0.1055\n0.1686\n0.1749\nSubnigra\n0.1309\n0.1939\n0.1950\n\nMOFCCx NAcShell NAcCore\n0.1076\n0.1796\n0.1814\nDLS\n0.1237\n0.1973\n0.2012\n\n0.1366\n0.1972\n0.2020\nOFC\n0.1878\n0.2695\n0.2723\n\nAmyg\n0.0883\n0.1422\n0.1390\nDMS\n0.1518\n0.2363\n0.2378\n\nHipp\n0.2091\n0.2662\n0.2798\n\nM1\n\n0.1350\n0.2034\n0.2080\n\n0.0904\n0.1599\n0.1609\nLHb\n0.1240\n0.1801\n0.1813\n\n0.1304\n0.1994\n0.2029\nThal\n0.1550\n0.2188\n0.2204\n\nV1\n\nVTA\n0.1317\n0.1907\n0.1923\nFrA\n0.1164\n0.1894\n0.1912\n\nFigure 4: The predictive patterns of individual neurons predicting multiple regions. (Left) A Hippocampus\ncell is the best single cell predictor of the V1 LFP. (Middle) A V1 cell with a relationship only to the V1 LFP.\n(Right) A Nucleus Accumbens Shell cell that is equivalent in predictive ability to the best V1 cell.\n\n4.2 Results on Sleep Data Set\nThe second data set was recorded from one mouse going through different sleep cycles over 6 hours.\n64 microwires were implanted in 16 different regions of the brain, using the Prelimbic Cortex (PrL),\nMedial Orbital Frontal Cortex (MOFCCx), the core and shell of the Nucleus Accumbens (NAc),\nBasal Amygdala (Amy), Hippocampus (Hipp), V1, Ventral Tegmental Area (VTA), Substantia nigra\n(Subnigra), Medial Dorsal Thalamus (MDThal), Lateral Habenula (LHb), Dorsolateral Striatum\n(DLS), Dorsomedial Striatum (DMS), Motor Cortex (M1), Orbital Frontal Cortex (OFC), and\nFrontal Association Cortex (FrA). LFPs were averaged over all electrodes in an area and filtered\nfrom 3-50Hz and sampled at 125Hz, and L was set to 0.5 seconds. 163 total neurons (single units)\nwere detected using spike sorting, and the data were split into 360 1-minute time bins. 
The leave-one-out\npredictive performance was higher for the dynamic single cell model on 159 out of 163\nneurons predicting the Hippocampus LFP. The mean hold-out RFEs for all recorded regions of the\nbrain are shown in Table 2 for all models, and the clustering model is the best performing model in\n15 of the 16 regions.\nPreviously published work looked at the predictability of the V1 LFP signal from individual V1\nneurons [11,18,19]. Our experiments find that the dictionary elements for all V1 cells (4 electrodes,\n4 cells in this dataset) are time-invariant and match the single-cell time-invariant dictionary shape\nof [11]. The dictionary elements for a single V1 cell predicting multiple regions are shown in Figure\n4(middle; for simplicity, only a subset of brain regions recorded from are shown). This suggests that\nthe V1 cell has a connection to the V1 region, but no other brain region that was recorded from in\nthis model. However, cells in other brain regions showed functional connectivity to V1. The best\nindividual predictor is a cell in the Hippocampus shown in Figure 4(left). An additional example\ncell is a cell in the Nucleus Accumbens shell that has the same RFE as the best V1 cell, and its shape\nis shown in Figure 4(right).\n\nFigure 5: (Left) The cluster predicting the V1 region of the brain, matching known pattern for individual V1\ncells [11, 18]. (Middle, Right) Clusters predicting the motor cortex that show positive (pro) and negative (anti)\nrelationships between amplitude and sleep.\n\nFigure 6: Mean RFE when the animal is awake and when it is asleep. (Left) Cluster\u2019s convolution factor is\nstable, and shows only minor differences between sleep and awake prediction. (Middle and Right) Clusters\nshown in Figure 5 (left and right), depicting varying patterns with the mouse\u2019s sleep state.\n\nSleep states are typically defined by dynamic changes in functional connectivity across brain regions\nas measured by EEG (LFPs recorded from the scalp) [20], but little is known about how single\nneurons contribute to, or interact with, these network changes. To get sleep covariates, each second of\ndata was scored into \u201cawake\u201d or \u201csleep\u201d states using the methods in [21], and the sleep state was\naveraged over the time bin. We defined a time bin to be a sleep state if \u2265 95% of the individual seconds\nare scored as a sleep state, and the animal is awake if \u2264 5% of the individual seconds are scored\nas a sleep state. In Figure 5(middle) we show a cluster that is most strongly positively correlated\nwith sleep (pro-sleep), and in Figure 5(right) we show a cluster that is most negatively correlated\nwith sleep (pro-awake). Both figures show the neuron locations as well as the mean waveform shape\nduring sleep and wake. In this case, the pro-sleep cluster is dominantly Hippocampus cells and the\nanti-sleep cluster comes from many different regions. There may be concern that, because these are\nthe maximally correlated clusters, these results are atypical. To address this concern, we note that the\np-value for finding a cluster this strongly correlated is 4 \u00d7 10^{\u22126} for Pearson correlation\nwith the Bonferroni correction for multiple tests. 
Furthermore, 4 of the 25 clusters detected showed\ncorrelation above 0.4 between amplitude and sleep state, so this is not an isolated phenomenon.\nThe RFE changes as a function of both frequency and sleep state for some clusters of neurons. Using\n1 Hz bandwidth frequency bins, in Figure 6 (middle and right) we show the mean RFE using only\nthe clusters in Figure 5 (middle and right). The cluster associated positively with sleeping shifts\nits frequency peak and increases its predictive ability when the animal is sleeping. Likewise, the\nsleep-decreased cluster predicts worse when the animal is asleep. For comparison, in\nFigure 6 (left) we include the frequency results for a cluster with a stable dictionary element. The\ntotal RFE is comparable and there is not a dramatic shift in the peak frequency between the sleep\nand awake states.\n5 Conclusions\nNovel models and methods are developed here to account for time-varying relationships between\nneurons and LFPs. Within the context of our experiments, significantly improved predictive performance\nis realized when one accounts for temporal dynamics in the neuron-LFP interrelationship.\nFurther, the clustering model reveals which neurons have similar relationships to a specific brain region,\nand the frequencies that are predictable in the LFP change with known dynamics of the animal\nstate. In future work, these ideas can be incorporated with attempts to learn network structure, and\nLFPs can be considered a common input when exploring networks of neurons [19, 22, 23]. Moreover,\nfuture experiments are being designed to place additional electrodes in a single brain region,\nwith the goal of detecting 100 neurons in a single brain region while recording LFPs in up to 20\nregions. 
The methods proposed here will facilitate exploration of both the diversity of neurons and\nthe differences in functional connectivity on an individual neuron scale.\nAcknowledgements The research reported here was funded in part by ARO, DARPA, DOE, NGA\nand ONR. We thank the reviewers for their helpful comments.\n\n[Figure 5 panels: Cluster predicting V1 Region; Pro-Sleep Cluster; Anti-Sleep Cluster. Axes: Time, s or Dictionary Element, s vs. Amplitude, a.u.; Awake and Sleep curves in the Pro-Sleep and Anti-Sleep panels; each panel includes a Number of Cells bar chart by region.]\n[Figure 6 panels: Sleep-Neutral Cluster RFE by Frequency; Sleep-Increased Cluster RFE by Frequency; Sleep-Decreased Cluster RFE by Frequency. Axes: Frequency, Hz vs. Mean RFE; curves: Awake, Sleep.]\n\nReferences\n[1] F. Varela, J.P. Lachaux, E. Rodriguez, and J. Martinerie. The brainweb: phase synchronization and large-scale integration. Nature Rev. Neuro., 2001.\n\n[2] B. Pesaran, J.S. Pezaris, M. Sahani, P.P. Mitra, and R.A. Andersen. Temporal structure in neuronal activity during working memory in macaque parietal cortex. Nature Neuroscience, 5:805\u2013811, 2002.\n\n[3] C. Mehring, J. Rickert, E. Vaadia, S. Cardoso de Oliveira, A. Aertsen, and S. Rotter. Inference of hand movements from local field potentials in monkey motor cortex. Nature Neuroscience, 6(12):1253\u20131254, 2003.\n\n[4] P.J. Uhlhaas and W. Singer. Abnormal neural oscillations and synchrony in schizophrenia. Nature Rev. Neuro., 2010.\n\n[5] M. Le Van Quyen and A. Bragin. Analysis of dynamic brain oscillations: methodological advances. Trends in Neurosciences, 2007.\n\n[6] D.E. Carlson, J.T. Vogelstein, Q. Wu, W. Lian, M. Zhou, C.R. Stoetzner, D. Kipke, D. Weber, D.B. Dunson, and L. 
Carin. Multichannel Electrophysiological Spike Sorting via Joint Dictionary Learning & Mixture Modeling. IEEE TBME, 2013.\n\n[7] K. Dzirasa, L. Coque, M.M. Sidor, S. Kumar, E.A. Dancy, J.S. Takahashi, C.A. McClung, and M.A.L. Nicolelis. Lithium ameliorates nucleus accumbens phase-signaling dysfunction in a genetic mouse model of mania. J. Neurosci., December 2010.\n\n[8] B. Chen, G. Polatkan, G. Sapiro, D. Dunson, and L. Carin. The hierarchical beta process for convolutional factor analysis and deep learning. ICML, 2011.\n\n[9] H. Ishwaran and L.F. James. Gibbs Sampling Methods for Stick-Breaking Priors. JASA, March 2001.\n\n[10] T.S. Ferguson. A Bayesian Analysis of Some Nonparametric Problems. Annals Stat., March 1973.\n\n[11] M. Rasch, N.K. Logothetis, and G. Kreiman. From neurons to circuits: linear estimation of local field potentials. J. Neurosci., November 2009.\n\n[12] M.C. Hughes and E.B. Sudderth. Memoized Online Variational Inference for Dirichlet Process Mixture Models. NIPS, 2013.\n\n[13] K.A. Heller and Z. Ghahramani. Bayesian Hierarchical Clustering. ICML, 2005.\n\n[14] D.M. Blei and M.I. Jordan. Variational inference for Dirichlet process mixtures. Bayesian Analysis, 2006.\n\n[15] S.J. Roberts and W.D. Penny. Variational Bayes for generalized autoregressive models. IEEE Trans. Signal Process., September 2002.\n\n[16] H. Rue, S. Martino, and N. Chopin. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. Royal Stat. Soc., 2009.\n\n[17] A. Calabrese and L. Paninski. Kalman filter mixture model for spike sorting of non-stationary data. J. Neurosci. Methods, 2010.\n\n[18] I. Nauhaus, L. Busse, M. Carandini, and D.L. Ringach. Stimulus contrast modulates functional connectivity in visual cortex. Nature Neuro., January 2009.\n\n[19] R.C. Kelly, M.A. Smith, R.E. Kass, and T.S. Lee. Local field potentials indicate network state and account for neuronal response variability. J. 
Comp. Neurosci., December 2010.\n\n[20] L.J. Larson-Prior, J.M. Zempel, T.S. Nolan, F.W. Prior, A.Z. Snyder, and M.E. Raichle. Cortical network functional connectivity in the descent to sleep. PNAS, March 2009.\n\n[21] K. Dzirasa, S. Ribeiro, R. Costa, L.M. Santos, S.-C. Lin, A. Grosmark, T.D. Sotnikova, R.R. Gainetdinov, M.G. Caron, and M.A.L. Nicolelis. Dopaminergic control of sleep-wake states. J. Neurosci., October 2006.\n\n[22] J.W. Pillow and P. Latham. Neural characterization in partially observed populations of spiking neurons. NIPS, 2007.\n\n[23] M.J. Rasch, A. Gretton, Y. Murayama, W. Maass, and N.K. Logothetis. Inferring spike trains from local field potentials. J. Neurophysiology, March 2008.\n\n[24] D.I. Kim, P. Gopalan, D.M. Blei, and E.B. Sudderth. Efficient Online Inference for Bayesian Nonparametric Relational Models. NIPS, 2012.\n", "award": [], "sourceid": 1111, "authors": [{"given_name": "David", "family_name": "Carlson", "institution": "Duke University"}, {"given_name": "Jana", "family_name": "Schaich Borg", "institution": "Duke University"}, {"given_name": "Kafui", "family_name": "Dzirasa", "institution": "Duke University"}, {"given_name": "Lawrence", "family_name": "Carin", "institution": "Duke University"}]}