{"title": "Interpretable Nonlinear Dynamic Modeling of Neural Trajectories", "book": "Advances in Neural Information Processing Systems", "page_first": 3333, "page_last": 3341, "abstract": "A central challenge in neuroscience is understanding how neural system implements computation through its dynamics. We propose a nonlinear time series model aimed at characterizing interpretable dynamics from neural trajectories. Our model assumes low-dimensional continuous dynamics in a finite volume. It incorporates a prior assumption about globally contractional dynamics to avoid overly enthusiastic extrapolation outside of the support of observed trajectories. We show that our model can recover qualitative features of the phase portrait such as attractors, slow points, and bifurcations, while also producing reliable long-term future predictions in a variety of dynamical models and in real neural data.", "full_text": "Interpretable Nonlinear Dynamic Modeling\n\nof Neural Trajectories\n\nYuan Zhao and Il Memming Park\n\nDepartment of Neurobiology and Behavior\n\n{yuan.zhao, memming.park}@stonybrook.edu\n\nDepartment of Applied Mathematics and Statistics\n\nInstitute for Advanced Computational Science\n\nStony Brook University, NY 11794\n\nAbstract\n\nA central challenge in neuroscience is understanding how neural system imple-\nments computation through its dynamics. We propose a nonlinear time series\nmodel aimed at characterizing interpretable dynamics from neural trajectories.\nOur model assumes low-dimensional continuous dynamics in a \ufb01nite volume. 
It incorporates a prior assumption about globally contractional dynamics to avoid overly enthusiastic extrapolation outside of the support of observed trajectories. We show that our model can recover qualitative features of the phase portrait such as attractors, slow points, and bifurcations, while also producing reliable long-term future predictions in a variety of dynamical models and in real neural data.\n\n1 Introduction\n\nContinuous dynamical systems theory lends itself as a framework for both qualitative and quantitative understanding of neural models [1, 2, 3, 4]. For example, models of neural computation are often implemented as attractor dynamics where the convergence to one of the attractors represents the result of computation. Despite the wide adoption of dynamical systems theory in theoretical neuroscience, solving the inverse problem, that is, reconstructing meaningful dynamics from neural time series, has been challenging. Popular neural trajectory inference algorithms often assume linear dynamical systems [5, 6], which lack the nonlinear features ubiquitous in neural computation, and typical approaches using nonlinear autoregressive models [7, 8] sometimes produce wild extrapolations that are unsuitable for scientific study aimed at confidently recovering features of the dynamics that reflect the nature of the underlying computation.\nIn this paper, we aim to build an interpretable dynamics model to reverse-engineer the neural implementation of computation. We assume slow continuous dynamics such that the sampled nonlinear trajectory is locally linear, thus allowing us to propose a flexible nonlinear time series model that directly learns the velocity field. Our particular parameterization yields better interpretability: identifying fixed points and ghost points is easy, and so is the linearization of the dynamics around those points for stability and manifold analyses. 
We further parameterize the velocity field using a finite number of basis functions, in addition to a global contractional component. These features encourage the model to focus on interpolating dynamics within the support of the training trajectories.\n\n2 Model\n\nConsider a general d-dimensional continuous nonlinear dynamical system driven by external input,\n\n\u02d9x = F(x, u)    (1)\n\nwhere x \u2208 Rd represents the dynamic trajectory, and F : Rd \u00d7 Rdi \u2192 Rd fully defines the dynamics in the presence of the input drive u \u2208 Rdi. We aim to learn the essential part of the dynamics F from a collection of trajectories sampled at frequency 1/\u2206.\n\n30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.\n\nOur work builds on extensive literature in nonlinear time series modeling. Assuming a separable, linear input interaction, F(x, u) = Fx(x) + Fu(x)u, a natural nonlinear extension of an autoregressive model is to use a locally linear expansion of (1) [7, 9]:\n\nxt+1 = xt + A(xt)xt + b(xt) + B(xt)ut + \u03f5t    (2)\n\nwhere b(x) = Fx(x)\u2206, A(x) : Rd \u2192 Rd\u00d7d is the Jacobian matrix of Fx at x scaled by the time step \u2206, B(x) : Rd \u2192 Rd\u00d7di is the linearization of Fu around x, and \u03f5t denotes model mismatch noise of order O(\u22062). For example, {A, B} are parametrized with a radial basis function (RBF) network in the multivariate RBF-ARX model of [10, 7], and {A, b, B} are parametrized with sigmoid neural networks in [9]. Note that A(\u00b7) is not guaranteed to be the Jacobian of the dynamical system (1) since A and b also change with x. In fact, the functional form for A(\u00b7) is not unique, and a powerful function approximator for b(\u00b7) makes A(\u00b7) redundant and over-parameterizes the dynamics.\nNote that (2) is a subclass of a general nonlinear model:\n\nxt+1 = f(xt) + B(xt)ut + \u03f5t,    (3)\n\nwhere f and B are the discrete-time solutions of Fx and Fu. 
This form is widely used and is called the nonlinear autoregressive with eXogenous inputs (NARX) model, where f assumes various functional forms (e.g. a neural network, an RBF network [11], or a Volterra series [8]).\nWe propose to use a specific parameterization,\n\nxt+1 = xt + g(xt) + B(xt)ut + \u03f5t\ng(xt) = Wg\u03c6(xt) \u2212 e^{\u2212\u03c4^2} xt    (4)\nvec(B(xt)) = WB\u03c6(xt)\n\nwhere \u03c6(\u00b7) is a vector of r continuous basis functions,\n\n\u03c6(\u00b7) = (\u03c61(\u00b7), . . . , \u03c6r(\u00b7))\u22a4.    (5)\n\nNote the inclusion of a global leak towards the origin whose rate is controlled by \u03c4^2. The further away from the origin (and as \u03c4 \u2192 0), the larger the effect of the global contraction. This encodes our prior knowledge that the neural dynamics are limited to a finite volume of phase space, and prevents solutions with nonsensical runaway trajectories.\nThe function g(x) directly represents the velocity field of an underlying smooth dynamics (1), unlike f(x) in (3), which can have convoluted jumps. We can even run the dynamics backwards in time, since the time evolution for small \u2206 is reversible (by taking g(xt) \u2248 g(xt+1)), which is not possible for (3), since f(x) is not necessarily an invertible function.\nFixed points x\u2217 satisfy g(x\u2217) + B(x\u2217)u = 0 for a constant input u. Far away from the fixed points, the dynamics are locally just a flow (rectification theorem) and largely uninteresting. The Jacobian in the absence of input, J = \u2202g(x)/\u2202x, provides the linearization of the dynamics around the fixed points (via the Hartman-Grobman theorem), and the corresponding fixed point is stable if all eigenvalues of J have negative real parts.\nWe can further identify fixed points and ghost points (resulting from the disappearance of fixed points due to bifurcation) from local minima of \u2225g\u2225 with small magnitude. 
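As a concrete illustration (a minimal sketch, not the authors' released code), the parameterization (4) with normalized Gaussian RBF basis functions can be written with NumPy as follows; the dimensions d, di, r and all function and variable names here are our own choices:

```python
import numpy as np

def rbf(x, centers, widths):
    # normalized squared-exponential basis; a small constant avoids a zero denominator
    d2 = np.sum((x - centers)**2, axis=1)
    e = np.exp(-d2 / (2.0 * widths**2))
    return e / (1e-7 + e.sum())

def g(x, Wg, centers, widths, tau):
    # velocity field of eq. (4): basis expansion plus a global leak toward the origin
    return Wg @ rbf(x, centers, widths) - np.exp(-tau**2) * x

def step(x, u, Wg, WB, centers, widths, tau):
    # one update x_{t+1} = x_t + g(x_t) + B(x_t) u_t, with vec(B) = WB phi(x)
    phi = rbf(x, centers, widths)
    B = (WB @ phi).reshape(len(x), len(u))
    return x + g(x, Wg, centers, widths, tau) + B @ u
```

With Wg = 0 and no input, the map contracts every state toward the origin by the factor 1 \u2212 e^{\u2212\u03c4^2} per step, which is the global leak described above.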
The flow around the ghost points can be extremely slow [4], and can exhibit signatures of computation through meta-stable dynamics [12]. Continuous attractors (such as limit cycles) are also important features of neural dynamics which exhibit spontaneous oscillatory modes. We can easily identify attractors by simulating the model.\n\n3 Estimation\n\nWe define the mean squared error as the loss function\n\nL(Wg, WB, c1...r, \u03c31...r) = (1/T) \u03a3_{t=0}^{T\u22121} \u2225g(xt) + B(xt)ut + xt \u2212 xt+1\u2225_2^2,    (6)\n\nwhere we use normalized squared exponential radial basis functions\n\n\u03c6i(z) = exp(\u2212\u2225z \u2212 ci\u2225^2 / (2\u03c3i^2)) / (\u03f5 + \u03a3_{j=1}^{r} exp(\u2212\u2225z \u2212 cj\u2225^2 / (2\u03c3j^2))),    (7)\n\nwith centers ci and corresponding kernel widths \u03c3i. The small constant \u03f5 = 10^{\u22127} is to avoid a numerically zero denominator.\nWe estimate the parameters {Wg, WB, \u03c4, c, \u03c3} by minimizing the loss function through gradient descent (Adam [13]) implemented within TensorFlow [14]. We initialize the matrices Wg and WB from a truncated standard normal distribution, the centers {ci} by the centroids of K-means clustering on the training set, and the kernel widths \u03c3 by the average Euclidean distance between the centers.\n\n4 Inferring Theoretical Models of Neural Computation\n\nWe apply the proposed method to a variety of low-dimensional neural models in theoretical neuroscience. Each theoretical model is chosen to represent a different mode of computation.\n\n4.1 Fixed point attractor and bifurcation for binary decision-making\n\nPerceptual decision-making and working memory tasks are widely used behavioral tasks that typically involve a low-dimensional decision variable, and subjects are close to optimal in their performance. 
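To make the estimation of Section 3 concrete: the paper minimizes (6) with Adam in TensorFlow, but as a simplified illustration (our own shortcut, not the authors' procedure), when the centers, widths, and \u03c4 are held fixed and there is no input, Wg enters the loss linearly and can be fit in closed form by ridge-regularized least squares:

```python
import numpy as np

def basis(x, centers, widths, eps=1e-7):
    # normalized squared-exponential RBFs as in eq. (7)
    e = np.exp(-np.sum((x - centers)**2, axis=1) / (2.0 * widths**2))
    return e / (eps + e.sum())

def fit_Wg(X, centers, widths, tau, lam=1e-6):
    # ridge least squares for Wg with centers, widths, and tau held fixed;
    # X has shape (T+1, d) and holds one observed trajectory without input
    Phi = np.stack([basis(x, centers, widths) for x in X[:-1]])  # (T, r)
    # targets are the basis part of g: x_{t+1} - x_t + e^{-tau^2} x_t
    Y = X[1:] - X[:-1] + np.exp(-tau**2) * X[:-1]
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y).T                       # (d, r)
```

Joint optimization of the centers, widths, and \u03c4 (as the paper does with Adam) is what makes the full problem nonconvex; this sketch only covers the linear subproblem.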
To understand how the brain implements such neural computation, many competing theories have been proposed [15, 16, 17, 18, 19, 20, 21]. We implemented the two-dimensional dynamical system from [20] where the final decision is represented by two stable fixed points corresponding to each choice. The stimulus strength (coherence) nonlinearly interacts with the dynamics (see appendix for details), and biases the choice by increasing the basin of attraction (Fig. 1). We encode the stimulus strength as a single variable held constant throughout each trajectory as in [20].\nThe model with 10 basis functions learned the dynamics from 90 training trajectories (30 per coherence c = 0, 0.5, \u22120.5). We visualize the log-speed as colored contours, and the direction component of the velocity field as arrows in Fig. 1. The fixed/ghost points are shown as red dots, which ideally should be at the crossings of the model nullclines given by solid lines. For each coherence, two novel starting points were simulated from the true model and the estimated model in Fig. 1. Although the model was trained with only low or moderate coherence levels, where there are two stable and one unstable fixed points, it predicts a bifurcation at higher coherence and identifies the ghost point (lower right panel).\nWe compare the model (4) to the following \u201clocally linear\u201d (LL) model,\n\nxt+1 = A(xt)xt + B(xt)ut + xt\nvec(A(xt)) = WA\u03c6(xt)    (8)\nvec(B(xt)) = WB\u03c6(xt)\n\nin terms of training and prediction errors in Table 1. Note that there is no contractional term. We train both models on the same trajectories described above. Then we simulate 30 trajectories from the true system and the trained models for coherence c = 1, with the same random initial states within the unit square, and calculate the mean squared error between the true trajectories and the model-simulated ones as the prediction error. 
The other parameters are set to the same values as in training. The LL model\n\nTable 1: Model errors\n\nModel | Training error | Prediction error: mean (std)\n(4)   | 4.06E-08       | 0.002 (0.008)\n(8)   | 2.04E-08       | 0.244 (0.816)\n\nhas poor prediction on the test set. This is due to unbounded flow out of the phase space where the training data lies (see Fig. 6 in the supplement).\n\nFigure 1: Wong and Wang\u2019s 2D dynamics model for perceptual decision-making [20]. We train the model with 90 trajectories (uniformly random initial points within the unit square, 0.5 s duration, 1 ms time step) with different input coherence levels c = {0, 0.5, \u22120.5} (30 trajectories per coherence). The yellow and green lines are the true nullclines. The black arrows represent the true velocity fields (direction only) and the red arrows are model-predicted ones. The black and gray circles are the true stable and unstable fixed points, while the red ones are local minima of the model prediction (including fixed points and slow points). The background contours are the model-predicted log\u2225ds/dt\u2225^2. We simulated two 1 s trajectories each for the true and learned model dynamics. The trajectories start from the cyan circles. The blue lines are from the true model and the cyan ones are simulated from the trained models. Note that we do not train our model on trajectories from the bottom right condition (c = 1).\n\nFigure 2: FitzHugh-Nagumo model. (a) Direction (black arrows) and log-speed (contours) of the true velocity field. Two blue trajectories starting at the blue circles are simulated from the true system. The yellow and green lines are the nullclines of v and w. The diamond is a spiral point. (b) 2-dimensional embedding of v: model-predicted velocity field (red arrows and background contours). The black arrows are the true velocity field. There are a few model-predicted slow points in light red. 
The blue lines are the same trajectories as the ones in (a). The cyan ones are simulated from the trained model with the same initial states as the blue ones. (c) 100-step prediction every 100 steps using a test trajectory generated with the same settings as training. (d) 200-step prediction every 200 steps using a test trajectory driven by a sinusoid input with 0.5 standard deviation white Gaussian noise.\n\n4.2 Nonlinear oscillator model\n\nOne of the most successful applications of dynamical systems in neuroscience is in the biophysical model of a single neuron. We study the FitzHugh-Nagumo (FHN) model, which is a 2-dimensional reduction of the Hodgkin-Huxley model [3]:\n\n\u02d9v = v \u2212 v^3/3 \u2212 w + I,    (9)\n\u02d9w = 0.08(v + 0.7 \u2212 0.8w),    (10)\n\nwhere v is the membrane potential, w is a recovery variable, and I is the magnitude of the stimulus current. The FHN model has been used to model the up-down states observed in the neural time series of anesthetized auditory cortex [22].\nWe train the model with 50 basis functions on 100 simulated trajectories with uniformly random initial states within the unit square [0, 1] \u00d7 [0, 1], driven by an injected current drawn from white Gaussian noise with mean 0.3 and standard deviation 0.2. The duration is 200 and the time step is 0.1.\n\nFigure 3: (a) Velocity field (true: black arrows, model-predicted: red arrows) for both direction and log-speed; model-predicted fixed points (red circles, solid: stable, transparent: unstable). (b) One trajectory from the true model (x, y), and one trajectory from the fitted model (\u02c6x, \u02c6y). The trajectory remains on the circle for both. Both are driven by the same input, and start at the same initial state.\n\nIn electrophysiological experiments, we only have access to v(t), and do not observe the slow recovery variable w. Delay embedding allows reconstruction of the phase space under mild conditions [23]. 
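The training data and the delay embedding for this section can be reproduced along the following lines (a hedged sketch: the Euler integrator and the helper names are our own choices; the current statistics, duration, and time step follow the text):

```python
import numpy as np

def simulate_fhn(T=2000, dt=0.1, seed=0):
    # Euler simulation of the FitzHugh-Nagumo equations (9)-(10)
    rng = np.random.default_rng(seed)
    v, w = rng.uniform(0, 1, size=2)          # initial state in the unit square
    vs = []
    for _ in range(T):
        I = 0.3 + 0.2 * rng.standard_normal()  # noisy injected current
        dv = v - v**3 / 3 - w + I
        dw = 0.08 * (v + 0.7 - 0.8 * w)
        v, w = v + dt * dv, w + dt * dw
        vs.append(v)
    return np.array(vs)

def delay_embed(v, lag):
    # 2D delay embedding (v(t), v(t - lag)); lag is measured in samples,
    # so the text's v(t - 10) at dt = 0.1 corresponds to lag = 100
    return np.stack([v[lag:], v[:-lag]], axis=1)
```

The embedded samples can then serve as the trajectory data x_t on which the dynamical model is fit.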
We build a 2D model by embedding v(t) as (v(t), v(t \u2212 10)), and fit the dynamical model (Fig. 2b). The phase space is distorted, but the overall prediction of the model is good given a fixed current (Fig. 2b). Furthermore, the temporal simulation of v(t) for white noise injection shows reliable long-term prediction (Fig. 2c). We also test the model in a regime far from the training trajectories, and the dynamics do not diverge away from a reasonable region of the phase space (Fig. 2d).\n\n4.3 Ring attractor dynamics for head direction network\n\nContinuous attractors such as line and ring attractors are often used as models for the neural representation of continuous variables [17, 4]. For example, head direction neurons are tuned for the angle of the animal\u2019s head direction, and a bump attractor network with ring topology has been proposed as the dynamics underlying the persistently active set of neurons [24]. Here we use the following two-variable reduction of the ring attractor system:\n\n\u03c4r \u02d9r = r0 \u2212 r,    (11)\n\u03c4\u03b8 \u02d9\u03b8 = I(t),    (12)\n\nwhere \u03b8 represents the head direction driven by input I(t), and r is the radial component representing the overall activity in the bump. The computational role of this ring attractor is to be insensitive to noise in the r direction, while integrating the differential input in the \u03b8 direction. In the absence of input, the head direction \u03b8 does a random walk around the ring attractor. The ring attractor consists of a continuum of stable fixed points with a center manifold.\nWe train the model with 50 basis functions on 150 trajectories. The duration is 5 and the time step is 0.01. The parameters are set as r0 = 2, \u03c4r = 1 and \u03c4\u03b8 = 1. The initial states are uniformly random within (x, y) \u2208 [\u22123, 3] \u00d7 [\u22123, 3]. 
The inputs are constant angles evenly spaced in [\u2212\u03c0, \u03c0] with Gaussian noise (\u00b5 = 0, \u03c3 = 5) added (see Fig. 7 in the online supplement).\nFrom the trained model, we can identify a number of fixed points arranged around the ring attractor (Fig. 3a). The true ring dynamics model has one negative and one zero eigenvalue in the Jacobian. Most of the model-predicted fixed points are stable (both eigenvalues have negative real parts) and the rest are unstable (both eigenvalues have positive real parts).\n\nFigure 4: (a) Vector plot of 1-step-ahead prediction on one Lorenz trajectory (test). (b) 50-step prediction every 50 steps on one Lorenz trajectory (test). (c) A 200-step window of (b) (100-300). The dashed lines are the true trajectory, the solid lines are the prediction, and the circles are the start points of the prediction.\n\n4.4 Chaotic dynamics\n\nChaotic dynamics (or near chaos) has been postulated to support asynchronous states in the cortex [1], and neural computation over time by generating rich temporal patterns [2, 25]. We consider the 3D Lorenz attractor as an example chaotic system. We simulate 20 trajectories from\n\n\u02d9x = 10(y \u2212 x),\n\u02d9y = x(28 \u2212 z) \u2212 y,    (13)\n\u02d9z = xy \u2212 (8/3)z.\n\nThe initial state of each trajectory is standard normal. The duration is 200 and the time step is 0.04. The first 300 transient states of each trajectory are discarded. We use 19 trajectories for training and the last one for testing. We train a model with 10 basis functions. Figure 4a shows the direction of prediction. The vectors represented by the arrows start from the current states and point at the next future state. The predicted vectors (red) overlap the true vectors (blue), implying the one-step-ahead predictions are close to the true values in both speed and direction. Panel (b) gives an overview showing that the prediction resembles the true trajectory. 
Panel (c) shows that the prediction is close to the true value up to 200 steps.\n\n5 Learning V1 neural dynamics\n\nTo test the model on data obtained from cortex, we use a set of trajectories obtained from the variational latent Gaussian process (vLGP) model [26]. The latent trajectory model infers a 5-dimensional trajectory that describes a large-scale V1 population recording (see [26] for details). The recording was from an anesthetized monkey where 72 different equally spaced directional drifting gratings were presented for 50 trials each. We used 63 well-tuned neurons out of 148 simultaneously recorded single units. Each trial lasts for 2.56 s and the stimulus was presented only during the first half.\nWe train our model with 50 basis functions on the trial-averaged trajectories for 71 directions, and use 1 direction for testing. The input was 3-dimensional: two boxcars indicating the stimulus direction (sin \u03b8, cos \u03b8), and one corresponding to a low-pass filtered stimulus onset indicator. Figure 5 shows the prediction of the best linear dynamical system (LDS) for the 71 directions, and the nonlinear prediction from our model. The LDS is given as xt+1 = Axt + But + xt, with parameters A and B found by least squares. Although the LDS is widely used for smoothing latent trajectories, it clearly is not a good predictor for the nonlinear trajectory of V1 (Fig. 5a). In comparison, our model captures the oscillations much better; however, it fails to capture the fine details of the oscillation and the stimulus-off period dynamics.\n\n(a) LDS prediction\n\n(b) Proposed model prediction\n\nFigure 5: V1 latent dynamics prediction. Models trained on 71 average trajectories for each directional motion are tested on the 1 unseen direction. We divide the average trajectory at 0\u25e6 into 200 ms segments and predict each whole segment from the starting point of the segment. 
Note the poor predictive performance of the linear dynamical system (LDS) model.\n\n6 Discussion\n\nTo connect dynamical theories of neural computation with neural time series data, we need to be able to fit an expressive model to the data that robustly predicts well. The model then needs to be interpretable such that signatures of neural computation from the theories can be identified by its qualitative features. We show that our method successfully learns low-dimensional dynamics, in contrast to fitting high-dimensional recurrent neural network models as in previous approaches [17, 4, 25]. We demonstrated that our proposed model works well for well-known dynamical models of neural computation with various features: a chaotic attractor, fixed point dynamics, bifurcation, line/ring attractors, and a nonlinear oscillator. In addition, we showed that it can model nonlinear latent trajectories extracted from high-dimensional neural time series.\nCritically, we assumed that the dynamics consist of a continuous and slow flow. This allowed us to parameterize the velocity field directly, reducing the complexity of the nonlinear function approximation, and making it easy to identify the fixed/slow points. An additional structural assumption was the existence of global contractional dynamics. This regularizes and encourages the dynamics to occupy a finite phase volume around the origin.\nPrevious strategies for visualizing arbitrary trajectories from a nonlinear system, such as recurrence plots, were often difficult to understand. We visualized the dynamics using the velocity field decomposed into speed and direction, and overlaid fixed/slow points found numerically as local minima of the speed. 
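Concretely, such local minima of the speed can be located by multi-start numerical optimization of \u2225g\u2225^2 (a hedged sketch: the function names, starting grid, and tolerances below are our own illustrative choices, not from the paper):

```python
import numpy as np
from scipy.optimize import minimize

def slow_points(g, starts, tol=1e-3):
    # minimize half the squared speed 0.5 * sum(g(x)**2) from many starting points;
    # minima with small value are candidate fixed or ghost (slow) points
    found = []
    for x0 in starts:
        res = minimize(lambda x: 0.5 * np.sum(g(x)**2), x0, method='Nelder-Mead')
        # keep only slow minima, deduplicating nearby solutions
        if res.fun < tol and not any(np.linalg.norm(res.x - p) < 1e-2 for p in found):
            found.append(res.x)
    return found
```

For a fitted model, g would be the learned velocity field; stability of each candidate then follows from the eigenvalues of the (numerical) Jacobian of g at that point, as described in Section 2.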
This is obviously more difficult for higher-dimensional dynamics, and dimensionality reduction and visualization that preserve essential dynamic features are left for future directions.\nThe current method is a two-step procedure for analyzing neural dynamics: first infer the latent trajectories, and then infer the dynamic laws. This is clearly not an efficient inference, and the next step would be to combine the vLGP observation model and inference algorithm with the interpretable dynamic model to develop a unified inference system.\nIn summary, we present a novel complementary approach to studying the neural dynamics of neural computation. Applications of the proposed method are not limited to neuroscience, but should be useful for studying other slow low-dimensional nonlinear dynamical systems from observations [27].\n\nAcknowledgment\n\nWe thank the reviewers for their constructive feedback. This work was partially supported by the Thomas Hartman Foundation for Parkinson\u2019s Research.\n\nReferences\n\n[1] D. Hansel and H. Sompolinsky. Synchronization and computation in a chaotic neural network. Physical Review Letters, 68(5):718\u2013721, Feb 1992.\n\n[2] W. Maass, T. Natschl\u00e4ger, and H. Markram. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation, 14:2531\u20132560, 2002.\n\n[3] E. M. Izhikevich. Dynamical systems in neuroscience: the geometry of excitability and bursting. Computational neuroscience. MIT Press, 2007.\n\n[4] D. Sussillo and O. Barak. Opening the black box: Low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Computation, 25(3):626\u2013649, December 2012.\n\n[5] L. Paninski, Y. Ahmadian, D. G. G. Ferreira, et al. A new look at state-space models for neural data. Journal of Computational Neuroscience, 29(1-2):107\u2013126, August 2010.\n\n[6] J. P. Cunningham and B. M. Yu. 
Dimensionality reduction for large-scale neural recordings. Nat Neurosci, 17(11):1500\u20131509, November 2014.\n\n[7] T. Ozaki. Time Series Modeling of Neuroscience Data. CRC Press, January 2012.\n\n[8] S. Eikenberry and V. Marmarelis. A nonlinear autoregressive Volterra model of the Hodgkin-Huxley equations. Journal of Computational Neuroscience, 34(1):163\u2013183, August 2013.\n\n[9] M. Watter, J. Springenberg, J. Boedecker, and M. Riedmiller. Embed to control: A locally linear latent dynamics model for control from raw images. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 2746\u20132754. Curran Associates, Inc., 2015.\n\n[10] M. Gan, H. Peng, X. Peng, X. Chen, and G. Inoussa. A locally linear RBF network-based state-dependent AR model for nonlinear time series modeling. Information Sciences, 180(22):4370\u20134383, November 2010.\n\n[11] S. Chen, S. A. Billings, C. F. N. Cowan, and P. M. Grant. Practical identification of NARMAX models using radial basis functions. International Journal of Control, 52(6):1327\u20131350, December 1990.\n\n[12] M. I. Rabinovich, R. Huerta, P. Varona, and V. S. Afraimovich. Transient cognitive dynamics, metastability, and decision making. PLoS Computational Biology, 4(5):e1000072+, May 2008.\n\n[13] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.\n\n[14] M. Abadi, A. Agarwal, P. Barham, et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.\n\n[15] O. Barak, D. Sussillo, R. Romo, M. Tsodyks, and L. F. Abbott. From fixed points to chaos: three models of delayed discrimination. Progress in Neurobiology, 103:214\u2013222, April 2013.\n\n[16] C. K. Machens, R. Romo, and C. D. Brody. Flexible control of mutual inhibition: A neural model of two-interval discrimination. 
Science, 307(5712):1121\u20131124, February 2005.\n\n[17] V. Mante, D. Sussillo, K. V. Shenoy, and W. T. Newsome. Context-dependent computation by recurrent\n\ndynamics in prefrontal cortex. Nature, 503(7474):78\u201384, November 2013.\n\n[18] S. Ganguli, J. W. Bisley, J. D. Roitman, et al. One-dimensional dynamics of attention and decision making\n\nin LIP. Neuron, 58(1):15\u201325, April 2008.\n\n[19] M. E. Mazurek, J. D. Roitman, J. Ditterich, and M. N. Shadlen. A role for neural integrators in perceptual\n\ndecision making. Cerebral Cortex, 13(11):1257\u20131269, November 2003.\n\n[20] K.-F. Wong and X.-J. Wang. A recurrent network mechanism of time integration in perceptual decisions.\n\nThe Journal of Neuroscience, 26(4):1314\u20131328, January 2006.\n\n[21] M. S. Goldman. Memory without feedback in a neural network. Neuron, 61(4):621\u2013634, February 2009.\n[22] C. Curto, S. Sakata, S. Marguet, V. Itskov, and K. D. Harris. A simple model of cortical dynamics\nexplains variability and state dependence of sensory responses in Urethane-Anesthetized auditory cortex.\nThe Journal of Neuroscience, 29(34):10600\u201310612, August 2009.\n\n[23] H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Cambridge University Press, 2003.\n[24] A. Peyrache, M. M. Lacroix, P. C. Petersen, and G. Buzsaki. Internally organized mechanisms of the head\n\ndirection sense. Nature Neuroscience, 18(4):569\u2013575, March 2015.\n\n[25] R. Laje and D. V. Buonomano. Robust timing and motor patterns by taming chaos in recurrent neural\n\nnetworks. Nat Neurosci, 16(7):925\u2013933, July 2013.\n\n[26] Y. Zhao and I. M. Park. Variational latent Gaussian process for recovering single-trial dynamics from\n\npopulation spike trains. ArXiv e-prints, April 2016.\n\n[27] B. C. Daniels and I. Nemenman. 
Automated adaptive inference of phenomenological dynamical models.\n\nNature Communications, 6:8133+, August 2015.\n\n9\n\n\f", "award": [], "sourceid": 1662, "authors": [{"given_name": "Yuan", "family_name": "Zhao", "institution": "Stony Brook University"}, {"given_name": "Il Memming", "family_name": "Park", "institution": "Stony Brook University"}]}