{"title": "Optimal Neural Codes for Control and Estimation", "book": "Advances in Neural Information Processing Systems", "page_first": 2987, "page_last": 2995, "abstract": "Agents acting in the natural world aim at selecting appropriate actions based on noisy and partial sensory observations. Many behaviors leading to decision making and action selection in a closed loop setting are naturally phrased within a control theoretic framework. Within the framework of optimal Control Theory, one is usually given a cost function which is minimized by selecting a control law based on the observations. While in standard control settings the sensors are assumed fixed, biological systems often gain from the extra flexibility of optimizing the sensors themselves. However, this sensory adaptation is geared towards control rather than perception, as is often assumed. In this work we show that sensory adaptation for control differs from sensory adaptation for perception, even for simple control setups. This implies, consistently with recent experimental results, that when studying sensory adaptation, it is essential to account for the task being performed.", "full_text": "Optimal Neural Codes for Control and Estimation\n\nAlex Susemihl1, Manfred Opper\nMethods of Arti\ufb01cial Intelligence\n\nTechnische Universit\u00a8at Berlin\n1 Current af\ufb01liation: Google\n\nRon Meir\n\nDepartment of Electrical Engineering\n\nTechnion - Haifa\n\nAbstract\n\nAgents acting in the natural world aim at selecting appropriate actions based on\nnoisy and partial sensory observations. Many behaviors leading to decision mak-\ning and action selection in a closed loop setting are naturally phrased within a\ncontrol theoretic framework. Within the framework of optimal Control Theory,\none is usually given a cost function which is minimized by selecting a control\nlaw based on the observations. 
While in standard control settings the sensors are assumed fixed, biological systems often gain from the extra flexibility of optimizing the sensors themselves. However, this sensory adaptation is geared towards control rather than perception, as is often assumed. In this work we show that sensory adaptation for control differs from sensory adaptation for perception, even for simple control setups. This implies, consistently with recent experimental results, that when studying sensory adaptation, it is essential to account for the task being performed.

1 Introduction

Biological systems face the difficult task of devising effective control strategies based on partial information communicated between sensors and actuators across multiple distributed networks. While the theory of Optimal Control (OC) has become widely used as a framework for studying motor control, the standard framework of OC neglects many essential attributes of biological control [1, 2, 3]. The classic formulation of closed loop OC considers a dynamical system (plant) observed through sensors which transmit their output to a controller, which in turn selects a control law that drives actuators to steer the plant. This standard view, however, ignores the fact that sensors, controllers and actuators are often distributed across multiple sub-systems, and disregards the communication channels between these sub-systems. 
While the importance of jointly considering control and communication within a unified framework was already clear to the pioneers of the field of Cybernetics (e.g., Wiener and Ashby), it is only in recent years that increasing effort is being devoted to the formulation of a rigorous systems-theoretic framework for control and communication (e.g., [4]). Since the ultimate objective of an agent is to select appropriate actions, it is clear that sensation and communication must subserve effective control, and should be gauged by their contribution to action selection. In fact, given the communication constraints that plague biological systems (and many current distributed systems, e.g., cellular networks, sensor arrays, power grids, etc.), a major concern of a control design is the optimization of sensory information gathering and communication (consistently with theories of active perception). For example, recent theoretical work demonstrated a sharp communication bandwidth threshold below which control (or even stabilization) cannot be achieved (for a summary of such results see [4]). Moreover, when informational constraints exist within a control setting, even simple (linear and Gaussian) problems become nonlinear and intractable, as exemplified in the famous Witsenhausen counter-example [5].

The inter-dependence between sensation, communication and control is often overlooked both in control theory and in computational neuroscience, where one assumes that the overall solution to the control problem consists of first estimating the state of the controlled system (without reference to the control task), followed by constructing a controller based on the estimated state. This idea, referred to as the separation principle in Control Theory, while optimal in certain restricted settings (e.g., Linear Quadratic Gaussian (LQG) control), is, in general, sub-optimal [6]. 
Unfortunately, it is in general very difficult to provide optimal solutions in cases where separation fails. A special case of the separation principle, referred to as Certainty Equivalence (CE), occurs when the controller treats the estimated state as the true state, and forms a controller assuming full state information. It is generally overlooked, however, that although the optimal control policy does not depend directly on the observation model at hand, the expected future costs do depend on the specifics of that model [7]. In this sense, even when CE holds, costs still arise from uncertain estimates of the state, and one can optimise the sensory observation model to minimise these costs, leading to sensory adaptation. At first glance, it might seem that the observation model that will minimise the expected future cost will be the observation model that minimises the estimation error. We will show, however, that this is not generally the case.

A great deal of the work in computational neuroscience has dealt independently with the problems of sensory adaptation and control, while, as stated above, these two issues are part and parcel of the same problem. In fact, it is becoming increasingly clear that biological sensory adaptation is task-dependent [8, 9]. For example, [9] demonstrates that task-dependent sensory adaptation takes place in purely motor tasks, explaining after-effect phenomena seen in experiments. In [10], the authors show that specific changes occur in sensory regions, implying sensory plasticity in motor learning. In this work we consider a simple setting for control based on spike-time sensory coding, and study the optimal coding of sensory information required in order to perform a well-defined motor task. We show that even if CE holds, the optimal encoder strategy, minimising the control cost, differs from the optimal encoder required for state estimation. 
This result demonstrates, consistently with experiments, that neural encoding must be tailored to the task at hand. In other words, when analyzing sensory neural data, one must pay careful attention to the task being performed. Interestingly, work within the distributed control community dealing with optimal assignment and selection of sensors leads to similar conclusions and to specific schemes for sensory adaptation.

The interplay between information theory and optimal control is a central pillar of modern control theory, and we believe it must be accounted for in the computational neuroscience community. Though statistical estimation theory has become central in neural coding issues, often through the Cramér-Rao bound, there have been few studies bridging the gap between partially observed control and neural coding. We hope to narrow this gap by presenting a simple example where control and estimation yield different conclusions. The remainder of the paper is organised as follows: In section 1.1 we introduce the notation and concepts; in section 2 we derive expressions for the cost-to-go of a linear-quadratic control system observed through spikes from a dense population of neurons; in section 3 we present the results and compare optimal codes for control and estimation with point-process filtering, Kalman filtering and LQG control; in section 4 we discuss the results and their implications.

1.1 Optimal Codes for Estimation and Control

We will deal throughout this paper with a dynamic system with state X_t, observed through noisy sensory observations Z_t, whose conditional distribution can be parametrised by a set of parameters ϕ, e.g., the widths and locations of the tuning curves of a population of neurons or the noise properties of the observation process. 
The conditional distribution is then given by P_ϕ(Z_t | X_t = x). Z_t could stand for a diffusion process dependent on X_t (denoted Y_t) or a set of doubly-stochastic Poisson processes dependent on X_t (denoted N_t^m). In that sense, the optimal Bayesian encoder for an estimation problem, based on the Mean Squared Error (MSE) criterion, can be written as

\varphi_e^* = \arg\min_\varphi \; E_z \left[ E_{X_t} \left[ \left( X_t - \hat{X}_t(Z_t) \right)^2 \,\middle|\, Z_t = z \right] \right],

where \hat{X}_t(Z_t) = E[X_t | Z_t] is the posterior mean, computable, in the linear Gaussian case, by the Kalman filter. We will throughout this paper consider the MMSE in equilibrium, that is, the error in estimating X_t from long sequences of observations Z_{[0,t]}. Similarly, considering a control problem with a cost given by

C(X^0, U^0) = \int_0^T c(X_s, U_s, s)\, ds + c_T(X_T),

where X^t = \{X_s \mid s \in [t, T]\}, U^t = \{U_s \mid s \in [t, T]\}, and so forth, we can define

\varphi_c^* = \arg\min_\varphi \; E_z \min_{U^t} \left[ E_{X^t} \left[ C(X^0, U^0) \mid Z_t = z \right] \right].

The certainty equivalence principle states that, given a control policy \gamma^* : \mathcal{X} \to \mathcal{U} which minimises the cost C,

\gamma^* = \arg\min_\gamma C(X^0, \gamma(X^0)),

the optimal control policy for the partially observed problem given by noisy observations Z^0 of X^0 is given by

\gamma_{CE}(Z_t) = \gamma^* \left( E[X^0 \mid Z_t] \right).

Note that we have used the notation \gamma(X^0) = \{\gamma(X_s), s \in [0, T]\}.

2 Stochastic Optimal Control

In stochastic optimal control we seek to minimize the expected future cost incurred by a system with respect to a control variable applied to that system. 
We will consider linear stochastic systems governed by the SDE

dX_t = (A X_t + B U_t)\, dt + D^{1/2} dW_t, \quad (1a)

with a cost function

C(X^t, U^t, t) = \int_t^T \left( X_s^\top Q X_s + U_s^\top R U_s \right) ds + X_T^\top Q_T X_T. \quad (1b)

From Bellman's optimality principle or variational analysis [11], it is well known that the optimal control is given by U_t^* = -R^{-1} B^\top S_t X_t, where S_t is the solution of the Riccati equation -\dot{S}_t = Q + A^\top S_t + S_t A - S_t B R^{-1} B^\top S_t, with boundary condition S_T = Q_T. The expected future cost at time t and state x under the optimal control is then given by

J(x, t) = \min_{U^t} E\left[ C(X^t, U^t, t) \mid X_t = x \right] = x^\top S_t x + \int_t^T \mathrm{Tr}(D S_s)\, ds.

This is usually called the optimal cost-to-go. However, the system's state is not always directly accessible and we are often left with noisy observations of it. For a class of systems, e.g. LQG control, CE holds and the optimal control policy for the indirectly observed control problem is simply the optimal control policy for the original control problem applied to the Bayesian estimate of the system's state. In that sense, if CE were to hold for the system above observed through noisy observations Y_t of the state at time t, the optimal control would be given simply by the observation-dependent control U_t^* = -R^{-1} B^\top S_t E[X_t | Y_t] [7].

Though CE, when applicable, gives us a simple way to determine the optimal control, when considering neural systems we are often interested in finding the optimal encoder, or the optimal observation model, for a given system. That is equivalent to finding the optimal tuning function for a given neuron model. 
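The backward Riccati sweep and the resulting cost-to-go are easy to check numerically. The following is a minimal one-dimensional sketch (scalar A, B, Q, R, so the Riccati equation becomes a scalar ODE integrated backwards by Euler steps); all parameter values are illustrative choices, not ones used in the paper.

```python
# Minimal 1-D sketch of the LQG cost-to-go: integrate the scalar Riccati
# equation backwards from S_T = q_T and accumulate the noise term Tr(D S_s).
# All parameter values below are illustrative, not from the paper.

def riccati_backward(a, b, q, r, q_T, T, n=10_000):
    """Solve -dS/dt = q + 2*a*S - (b**2/r)*S**2 backwards in time by Euler."""
    dt = T / n
    s = q_T
    traj = [s]
    for _ in range(n):
        s_dot = q + 2 * a * s - (b ** 2 / r) * s ** 2
        s = s + dt * s_dot              # stepping backwards from t = T to t = 0
        traj.append(s)
    traj.reverse()                       # traj[k] approximates S(k * dt)
    return traj

def cost_to_go(x, t_idx, traj, d, T):
    """J(x, t) = S_t x^2 + int_t^T d * S_s ds for the fully observed problem."""
    dt = T / (len(traj) - 1)
    noise_cost = d * dt * sum(traj[k] for k in range(t_idx, len(traj) - 1))
    return traj[t_idx] * x ** 2 + noise_cost

S = riccati_backward(a=-1.0, b=0.5, q=1.0, r=0.2, q_T=0.1, T=2.0)
J0 = cost_to_go(x=1.0, t_idx=0, traj=S, d=0.3, T=2.0)
```

Note that even in this fully observed case the cost-to-go splits into a state-dependent part and a noise part that accumulates Tr(D S_s); it is the analogue of that second part which later carries the encoder dependence.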
Since CE neatly separates the estimation and control steps, it would be tempting to assume that the optimal codes obtained for an estimation problem would also be optimal for an associated control problem. We will show here that this is not the case.

As an illustration, let us consider the case of LQG with incomplete state information. One could, for example, take the observations to be a secondary process Y_t, which itself is a solution to

dY_t = F X_t\, dt + G^{1/2} dV_t. \quad (2)

The optimal cost-to-go would then be given by [11]

J(y, t) = \min_{U^t} E\left[ C(X^t, U^t, t) \,\middle|\, Y_{[0,t]} = y \right] = \nu_t^\top S_t \nu_t + \mathrm{Tr}(K_t S_t) + \int_t^T \mathrm{Tr}(D S_s)\, ds + \int_t^T \mathrm{Tr}\left( S_s B R^{-1} B^\top S_s K_s \right) ds, \quad (3)

where we have defined Y_{[0,t]} = \{Y_s, s \in [0, t]\}, \nu_t = E[X_t | Y_{[0,t]}] and K_t = \mathrm{cov}[X_t | Y_{[0,t]}]. We give a demonstration of these results in the SI, but for a thorough review see [11]. Note that through the last term in equation (3) the cost-to-go now depends on the parameters of the Y_t process. More precisely, the variance of the distribution of X_s given Y_{[0,t]}, for s > t, obeys the ODE

\dot{K}_t = A K_t + K_t A^\top + D - K_t F^\top G^{-1} F K_t. \quad (4)

One could then choose the matrices F and G in such a way as to minimise the contribution of the rightmost term in equation (3). Note that in the LQG case this is not particularly interesting, as the conclusion is simply that we should strive to make K_t as small as possible, by making the term F^\top G^{-1} F as large as possible. This translates to choosing an observation process with very strong steering from the unobserved process (large F) and a very small noise (small G). One case that provides some more interesting situations is if we consider a two-dimensional system, where we are restricted to a noise covariance with constant determinant. 
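The monotone dependence just described can be checked with a scalar version of equation (4): in one dimension the steady-state variance solves a quadratic whose positive root shrinks as the observation gain f²/g grows. The numbers below are illustrative.

```python
# Scalar sketch of equation (4): dK/dt = 2*a*K + d - (f**2/g) * K**2.
# The steady state solves (f**2/g)*K^2 - 2*a*K - d = 0; its positive root
# decreases monotonically in the observation gain f**2/g, which is why
# unconstrained sensor design in plain LQG is trivial (just increase the gain).
# Parameter values are illustrative.
import math

def steady_state_variance(a, d, obs_gain):
    """Positive root of obs_gain*K^2 - 2*a*K - d = 0 (a < 0: stable dynamics)."""
    if obs_gain == 0:
        return -d / (2 * a)
    return (2 * a + math.sqrt(4 * a ** 2 + 4 * obs_gain * d)) / (2 * obs_gain)

gains = [0.1, 1.0, 10.0, 100.0]
ks = [steady_state_variance(a=-0.5, d=1.0, obs_gain=g) for g in gains]
# ks decreases as the gain grows: better sensors always help here.
```

The constant-determinant restriction discussed next is what removes this trivial "make the gain infinite" answer and forces a genuine allocation problem.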
That means the hypervolume spanned by the eigenvectors of the covariance matrix is constant. We will compare this case with the Poisson-coded case below.

2.1 LQG Control with Dense Gauss-Poisson Codes

Let us now consider the case of the system given by equation (1a), but instead of observing the system directly we observe a set of doubly-stochastic Poisson processes \{N_t^m\} with rates given by

\lambda_m(x) = \phi \exp\left[ -(x - \theta_m)^\top P^\dagger (x - \theta_m)/2 \right]. \quad (5)

To clarify, the process N_t^m is a counting process which counts how many spikes the neuron m has fired up to time t. In that sense, the differential of the counting process dN_t^m will give the spike train process, a sum of Dirac delta functions placed at the times of spikes fired by neuron m. Here P^\dagger denotes the pseudo-inverse of P, which is used to allow for tuning functions that do not depend on certain coordinates of the stimulus x. Furthermore, we will assume that the tuning centres \theta_m are such that the probability of observing a spike of any neuron at a given time, \hat\lambda = \sum_m \lambda_m(x), is independent of the specific value of the world state x. This can be a consequence of either a dense packing of the tuning centres \theta_m along a given dimension of x, or of an absolute insensitivity to that aspect of x through a null element in the diagonal of P^\dagger. This is often called the dense coding hypothesis [12]. It can readily be shown that the filtering distribution is given by P(X_t | \{N_{[0,t)}\}) = \mathcal{N}(\mu_t, \Sigma_t), where the mean and covariance are solutions to the stochastic differential equations (see [13])

d\mu_t = (A \mu_t + B U_t)\, dt + \sum_m \Sigma_t \left( I + P^\dagger \Sigma_t \right)^{-1} P^\dagger (\theta_m - \mu_t)\, dN_t^m, \quad (6a)

d\Sigma_t = \left( A \Sigma_t + \Sigma_t A^\top + D \right) dt - \Sigma_t P^\dagger \Sigma_t \left( I + P^\dagger \Sigma_t \right)^{-1} dN_t, \quad (6b)

where we have defined \mu_t = E[X_t | \{N^m_{[0,t]}\}] and \Sigma_t = \mathrm{cov}[X_t | \{N^m_{[0,t]}\}]. Note that we have also defined N^m_{[0,t]} = \{N^m_s \mid s \in [0, t]\}, the history of the process N^m_s up to time t, and N_t = \sum_m N^m_t.

Using Lemma 7.1 from [11] provides a simple connection between the cost function and the solution of the associated Riccati equation for a stochastic process. We have

C(X^t, U^t, t) = X_T^\top Q_T X_T + \int_t^T \left[ X_s^\top Q X_s + U_s^\top R U_s \right] ds
= X_t^\top S_t X_t + \int_t^T (U_s + R^{-1} B^\top S_s X_s)^\top R\, (U_s + R^{-1} B^\top S_s X_s)\, ds + \int_t^T \mathrm{Tr}(D S_s)\, ds + \int_t^T dW_s^\top D^{\top/2} S_s X_s + \int_t^T X_s^\top S_s D^{1/2}\, dW_s.

We can average over P(X^t, N^t | \{N_{[0,t)}\}) to obtain the expected future cost. That gives us

\mu_t^\top S_t \mu_t + \mathrm{Tr}(\Sigma_t S_t) + \int_t^T \mathrm{Tr}(D S_s)\, ds + E\left[ \int_t^T (U_s + R^{-1} B^\top S_s X_s)^\top R\, (U_s + R^{-1} B^\top S_s X_s)\, ds \,\middle|\, \{N_{[0,t)}\} \right].

We can evaluate the average over P(X^t, \{N^m_t\} | \{N^m_{[0,t)}\}) in two steps, by first averaging over the Gaussian densities P(X_s | \{N^m_{[0,s]}\}) and then over P(\{N_{[0,s]}\} | \{N_{[0,t)}\}). The average gives

E\left[ \int_t^T (U_s + R^{-1} B^\top S_s \mu_s)^\top R\, (U_s + R^{-1} B^\top S_s \mu_s) + \mathrm{Tr}\left( S_s B R^{-1} B^\top S_s \Sigma_s(\{N_{[0,s]}\}) \right) ds \,\middle|\, \{N_{[0,t)}\} \right],

where \mu_s and \Sigma_s are the mean and variance associated with the distribution P(X_s | \{N_{[0,s)}\}). Note that choosing U_s = -R^{-1} B^\top S_s \mu_s will minimise the expression above, consistently with CE. The optimal cost-to-go is therefore given by

J(\{N_{[0,t)}\}, t) = \mu_t^\top S_t \mu_t + \mathrm{Tr}(\Sigma_t S_t) + \int_t^T \mathrm{Tr}(D S_s)\, ds + \int_t^T \mathrm{Tr}\left( S_s B R^{-1} B^\top S_s\, E\left[ \Sigma_s(\{N_{[0,s]}\}) \mid \{N_{[0,t)}\} \right] \right) ds. \quad (7)

Note that the only term in the cost-to-go function that depends on the parameters of the encoders is the rightmost term, and it depends on them only through the average over future paths of the filtering variance \Sigma_s. The average of the future covariance matrix is precisely the MMSE for the filtering problem conditioned on the belief state at time t [13]. We can therefore analyse the quality of an encoder for a control task by looking at the values of the term on the right for different encoding parameters. Furthermore, since the dynamics of \Sigma_t given by equation (6b) is Markovian, we can write the average E[\Sigma_s | \{N_{[0,t)}\}] as E[\Sigma_s | \Sigma_t]. 
We will then define the function f(\Sigma, t), which gives us the uncertainty-related expected future cost for the control problem, as

f(\Sigma, t) = \int_t^T \mathrm{Tr}\left( S_s B R^{-1} B^\top S_s\, E[\Sigma_s | \Sigma_t = \Sigma] \right) ds. \quad (8)

2.2 Mutual Information

Many results in information theory are formulated in terms of the mutual information of the communication channel P_ϕ(Y|X). For example, the maximum cost reduction achievable with R bits of information about an unobserved variable X has been shown to be a function of the rate-distortion function with the cost as the distortion function [14]. More recently there has also been a lot of interest in the so-called I-MMSE relations, which provide connections between the mutual information of a channel and the minimal mean squared error of the Bayes estimator derived from the same channel [15, 16]. The mutual information for the cases we are considering is not particularly complex, as all distributions are Gaussian. Let us denote by \Sigma^0_t the covariance of the unobserved process X_t conditioned on some initial Gaussian distribution P_0 = \mathcal{N}(\mu_0, \Sigma_0) at time 0. We can then consider the mutual information between the stimulus at time t, X_t, and the observations up to time t, Y_{[0,t]} or N_{[0,t]}. For the LQG/Kalman case we have simply

I(X_t; Y_{[0,t]} | P_0) = \int dx\, dy\, P(x, y) \left[ \log P(x|y) - \log P(x) \right] = \frac{1}{2} \left( \log |\Sigma^0_t| - \log |\Sigma_t| \right),

where \Sigma_t is a solution of equation (4). 
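The Gaussian log-determinant identity above can be illustrated in one dimension by propagating the prior variance with and without observations (the observed variance follows the scalar version of equation (4)) and taking half the log-ratio. Dynamics and gain values below are illustrative.

```python
# Scalar illustration of I(X_t; Y_[0,t]) = (1/2)(log sigma0_t - log sigma_t):
# propagate the prior variance without observations and the filtered variance
# with observations, then take half the log-ratio. Parameters are illustrative.
import math

def variances(a, d, obs_gain, t, n=20_000):
    dt = t / n
    s_prior, s_filt = 0.01, 0.01          # shared initial uncertainty
    for _ in range(n):
        s_prior += dt * (2 * a * s_prior + d)
        # scalar version of equation (4): dK/dt = 2aK + d - gain * K^2
        s_filt += dt * (2 * a * s_filt + d - obs_gain * s_filt ** 2)
    return s_prior, s_filt

def mutual_information(obs_gain):
    s0, s = variances(a=-0.5, d=1.0, obs_gain=obs_gain, t=3.0)
    return 0.5 * (math.log(s0) - math.log(s))
```

With zero observation gain the two variances coincide and the mutual information vanishes; increasing the gain shrinks the filtered variance and increases the information, as the identity requires.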
For the Dense Gauss-Poisson code, denoting the solution of the stochastic differential equation (6b) for a given realisation N_{[0,t]} by \Sigma_t(N_{[0,t]}), we have

I(X_t; N_{[0,t]} | P_0) = \int dx\, dn\, P(x, n) \left[ \log P(x|n) - \log P(x) \right] = \frac{1}{2} \left( \log |\Sigma^0_t| - E_{N_{[0,t]}} \left[ \log |\Sigma_t(N_{[0,t]})| \right] \right).

3 Optimal Neural Codes for Estimation and Control

What could be the reasons for an optimal code for an estimation problem to be sub-optimal for a control problem? We present examples that show two possible reasons for different optimal coding strategies in estimation and control. First, one should note that control problems are often defined over a finite time horizon. One set of classical experiments involves reaching for a target under time constraints [3]. If we take the maximal firing rate of the neurons (\phi) to be constant while varying the width of the tuning functions, the number of observed spikes will be inversely proportional to the precision of those spikes, forcing a trade-off between the number of observations and their quality. This trade-off can be tilted to either side in the case of control depending on the information available at the start of the problem. If we are given complete information on the system state at the initial time 0, the encoder needs fewer spikes to reliably estimate the system's state throughout the duration of the control experiment, and the optimal encoder will be tilted towards a lower number of spikes with higher precision. Conversely, if at the beginning of the experiment we have very little information about the system's state, reflected in a very broad distribution, the encoder will be forced towards lower precision spikes with higher frequency. 
These results are discussed in section 3.1.

Secondly, one should note that the optimal encoder for estimation does not take into account the differential weighting of different dimensions of the system's state. When considering a multidimensional estimation problem, the optimal encoder will generally allocate all its resources equally between the dimensions of the system's state. In the framework presented we can think of the dimensions as the singular vectors of the tuning matrix P, and the resources allocated to them as the singular values. In this sense, we will consider a set of coding strategies defined by matrices P of constant determinant in section 3.2. This constrains the overall firing rate of the population of neurons to be constant, and we can then consider how the population will best allocate its observations between these dimensions. Clearly, if we have an anisotropic control problem, which places a higher importance on controlling one dimension, the optimal encoder for the control problem will be expected to allocate more resources to that dimension. This is indeed shown to be the case for the Poisson codes considered, as well as for a simple LQG problem when we constrain the noise covariance to have the same structure.

We do not mean our analysis to be exhaustive as to the factors leading to different optimal codes in estimation and control settings, as the general problem is intractable, and indeed, is not even separable. 
We intend this to be a proof of concept showing two cases in which the analogy between control and estimation breaks down.

3.1 The Trade-off Between Precision and Frequency of Observations

In this section we consider populations of neurons with tuning functions as given by equation (5), with tuning centers \theta_m distributed along a one-dimensional line. In the case of the Ornstein-Uhlenbeck process these will be simply one-dimensional values \theta_m, whereas in the case of the stochastic oscillator we will consider tuning centres of the form \theta_m = (\eta_m, 0)^\top, filling only the first dimension of the stimulus space. Note that in both cases the (dense) population firing rate \hat\lambda = \sum_m \lambda_m(x) will be given by \hat\lambda = \sqrt{2\pi}\, p\, \phi / |\Delta\theta|, where \Delta\theta is the separation between neighbouring tuning centres \theta_m.

The Ornstein-Uhlenbeck (OU) process controlled by a process U_t is given by the SDE dX_t = (b U_t - \gamma X_t)\, dt + D^{1/2} dW_t. Equation (7) can then be solved by simulating the dynamics of \Sigma_s. This has been considered extensively in [13] and we refer to the results therein. Specifically, it has been found that the dynamics of the average can be approximated in a mean-field approach yielding surprisingly good results. The evolution of the average posterior variance is given by the average of equation (6b), which involves nonlinear averages over the covariances. 
These are intractable, but a simple mean-field approach yields the approximate equation for the evolution of the average \langle\Sigma_s\rangle = E[\Sigma_s | \Sigma_0]:

\frac{d\langle\Sigma_s\rangle}{ds} = A \langle\Sigma_s\rangle + \langle\Sigma_s\rangle A^\top + D - \hat\lambda\, \langle\Sigma_s\rangle P^\dagger \langle\Sigma_s\rangle \left( I + P^\dagger \langle\Sigma_s\rangle \right)^{-1}.

The alternative is to simulate the stochastic dynamics of \Sigma_t for a large number of samples and compute numerical averages. These results can be directly employed to evaluate the optimal cost-to-go in the control problem f(\Sigma, t).

Alternatively, we can look at a system with more complex dynamics, and we take as an example the stochastic damped harmonic oscillator given by the system of equations

\dot{X}_t = V_t, \qquad dV_t = \left( b U_t - \gamma V_t - \omega^2 X_t \right) dt + \eta^{1/2} dW_t. \quad (9)

Furthermore, we assume that the tuning functions only depend on the position of the oscillator, therefore not giving us any information about the velocity. The controller in turn seeks to keep the oscillator close to the origin while steering only the velocity. This can be achieved by the choice of matrices A = (0, 1; -\omega^2, -\gamma), B = (0, 0; 0, b), D = (0, 0; 0, \eta), R = (0, 0; 0, r), Q = (q, 0; 0, 0) and P = (p^2, 0; 0, 0).

In figure 1 we provide the uncertainty-dependent costs for LQG control and for the Poisson-observed control, as well as the MMSE for the Poisson filtering problem and for a Kalman-Bucy filter with the same noise covariance matrix P. More precisely, we are considering a Kalman-Bucy filter for the same dynamic system, but observed through a diffusion process as in equation (2) with F = I and G = P. This illustrates nicely the difference between Kalman filtering and the Gauss-Poisson filtering considered here. 
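The precision/frequency trade-off of section 3.1 can be sketched with the scalar mean-field equation for the OU process: widening the tuning (larger p) raises the population rate \hat\lambda \propto p but blurs each spike (observation noise p^2), so the equilibrium MMSE has an interior optimum in p. The constants below are illustrative, not the parameter sets used in the paper's figures.

```python
# Mean-field sketch of the precision/frequency trade-off for the OU process.
# Scalar mean-field equation: d<S>/dt = -2*gamma*<S> + D - lam_hat*<S>^2/(p^2+<S>),
# with lam_hat = sqrt(2*pi)*p*phi/dtheta (rate grows with tuning width p).
# All constants are illustrative.
import math

GAMMA, D_NOISE, PHI, DTHETA = 1.0, 0.6, 2.0, 0.05

def mean_field_mmse(p, t_end=10.0, n=20_000, sigma0=5.0):
    """Integrate the scalar mean-field equation to equilibrium and return <S>."""
    lam_hat = math.sqrt(2 * math.pi) * p * PHI / DTHETA
    dt = t_end / n
    s = sigma0
    for _ in range(n):
        s += dt * (-2 * GAMMA * s + D_NOISE - lam_hat * s * s / (p * p + s))
    return s

ps = [0.05 * k for k in range(1, 60)]       # tuning widths to scan
errs = [mean_field_mmse(p) for p in ps]
best_p = ps[errs.index(min(errs))]
# best_p lands strictly inside the scanned range: neither the narrowest
# nor the widest tuning minimises the equilibrium MMSE.
```

This reproduces, in caricature, the non-monotone MMSE curves of figure 1: very narrow tuning starves the filter of spikes, very wide tuning delivers many uninformative ones.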
Figure 1: The trade-off between the precision and the frequency of spikes is illustrated for the OU process (left) and the stochastic oscillator (right). In both figures, the initial condition has a very uncertain estimate of the system's state, biasing the optimal tuning width towards higher values. This forces the encoder to amass the maximum number of observations within the duration of the control experiment. Parameters for the left figure were: T = 2, \gamma = 1.0, \eta = 0.6, b = 0.2, \phi = 0.1, \Delta\theta = 0.05, Q = 0.1, Q_T = 0.001, R = 0.1. Parameters for the right figure were: T = 5, \gamma = 0.4, \omega = 0.8, \eta = 0.4, r = 0.4, q = 0.4, Q_T = 0, \phi = 0.5, \Delta\theta = 0.1.

The Kalman filter MSE has a simple, monotonically increasing dependence on the noise covariance, and one should simply strive to design sensors with the highest possible precision (p = 0) to minimise the MMSE and control costs. The Poisson case leads to optimal performance at a non-zero value of p. Importantly, the optimal values of p for estimation and control differ. Furthermore, in view of section 2.2, we also plotted the mutual information between the process X_t and the observation process N_t, to illustrate that information-based arguments would lead to the same optimal encoder as MMSE-based arguments.

3.2 Allocating Observation Resources in Anisotropic Control Problems

A second factor that could lead to different optimal encoders in estimation and control is the structure of the cost function C. Specifically, if the cost function depends more strongly on a certain coordinate of the system's state, uncertainty in that particular coordinate will have a higher impact on expected future costs than uncertainty in other coordinates. We will here consider two simple linear control systems observed by a population of neurons restricted to a certain firing rate. 
This can be thought of as a metabolic constraint, since the regeneration of membrane potential necessary for action potential generation is one of the most significant metabolic expenditures for neurons [17]. This will lead to a trade-off, where an increase in precision in one coordinate will result in a decrease in precision in the other coordinate.

We consider a population of neurons whose tuning functions cover a two-dimensional space. Taking a two-dimensional isotropic OU system with state X_t = (X_{1,t}, X_{2,t})^\top where both dimensions are uncoupled, we can consider a population with tuning centres \theta_m = (\eta_1^m, \eta_2^m)^\top densely covering the stimulus space. To consider a smoother class of stochastic systems we will also consider a two-dimensional stochastic oscillator with state X_t = (X_{1,t}, V_{1,t}, X_{2,t}, V_{2,t})^\top, where again, both dimensions are uncoupled, and tuning centres of the form \theta_m = (\eta_1^m, 0, \eta_2^m, 0)^\top, covering densely the position space, but not the velocity space.

Since we are interested in the case of limited resources, we will restrict ourselves to populations with a tuning matrix P yielding a constant population firing rate. We can parametrise these simply as P_{OU}(\zeta) = p^2 \mathrm{Diag}(\tan(\zeta), \mathrm{cotan}(\zeta)) for the OU case and P_{Osc}(\zeta) = p^2 \mathrm{Diag}(\tan(\zeta), 0, \mathrm{cotan}(\zeta), 0) for the stochastic oscillator, where \zeta \in (0, \pi/2). Note that this will yield the firing rate \hat\lambda = 2\pi p^2 \phi/(\Delta\theta)^2, independent of the specifics of the matrix P.

We can then compare the performance of all observers with the same firing rate in both control and estimation tasks. As mentioned, we are interested in control problems where the cost functions are anisotropic, that is, one dimension of the system's state vector contributes more heavily to the cost function. 
To study this case we consider cost functions of the type c(X_t, U_t) = Q_1 X_{1,t}^2 + Q_2 X_{2,t}^2 + R_1 U_{1,t}^2 + R_2 U_{2,t}^2. This, again, can be readily cast into the formalism introduced above, with a suitable choice of matrices Q and R both for the OU process and for the stochastic oscillator. We will also consider the case where the first dimension of X_t contributes more strongly to the state costs (i.e., Q_1 > Q_2).

Figure 2: The differential allocation of resources in control and estimation for the OU process (left) and the stochastic oscillator (right). Even though the estimation MMSE leads to a symmetric optimal encoder both in the Poisson and in the Kalman filtering problem, the optimal encoders for the control problem are asymmetric, allocating more resources to the first coordinate of the stimulus.

The filtering error can be obtained from the formalism developed in [13] in the case of Poisson observations and directly from the Kalman-Bucy equations in the case of Kalman filtering [18]. For LQG control, one can simply solve the control problem for the system mentioned using the standard methods (see e.g. [11]). The Poisson-coded version of the control problem can be solved using either direct simulation of the dynamics of \Sigma_s or by a mean-field approach which has been shown to yield excellent results for the system at hand. These results are summarised in figure 2, with similar notation to that in figure 1. 
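The allocation effect can be sketched per dimension under the same mean-field assumption as above: the two uncoupled OU dimensions see tuning entries p^2 tan(\zeta) and p^2 cotan(\zeta) (constant product, hence constant population rate), and an anisotropic weighting Q_1 \gg Q_2 rewards shifting precision to the first dimension. The per-dimension equation and all numbers below are illustrative simplifications, not the paper's actual computation.

```python
# Sketch of section 3.2's resource allocation for two uncoupled OU dimensions.
# Each dimension's average posterior variance follows the scalar mean-field
# equation with its own tuning entry p_ii; tan/cotan keeps the product (and
# hence the population rate) fixed. All constants are illustrative.
import math

GAMMA, D_NOISE, LAM_HAT, P2 = 1.0, 0.5, 20.0, 1.0

def eq_variance(p_ii, t_end=10.0, n=20_000):
    """Equilibrium of d<S>/dt = -2*gamma*<S> + D - lam_hat*<S>^2/(p_ii + <S>)."""
    dt, s = t_end / n, 1.0
    for _ in range(n):
        s += dt * (-2 * GAMMA * s + D_NOISE - LAM_HAT * s * s / (p_ii + s))
    return s

def weighted_cost(zeta, q1, q2):
    """Uncertainty cost Q1*<S1> + Q2*<S2> under the constant-rate constraint."""
    s1 = eq_variance(P2 * math.tan(zeta))
    s2 = eq_variance(P2 / math.tan(zeta))
    return q1 * s1 + q2 * s2

sym = weighted_cost(math.pi / 4, q1=10.0, q2=0.1)    # symmetric allocation
skew = weighted_cost(math.pi / 8, q1=10.0, q2=0.1)   # precise in dimension 1
```

Under the strongly anisotropic weights the asymmetric encoder beats the symmetric one, mirroring the asymmetric optima of figure 2.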
Note the extreme example of the stochastic oscillator, where the optimal encoder concentrates all the resources in one dimension, essentially ignoring the second dimension.

4 Conclusion and Discussion

We have shown that the optimal encoding strategy for a partially observed control problem is not the same as the optimal encoding strategy for the associated state estimation problem. Note that this is a natural consequence of considering noise covariances with a constant determinant in the case of Kalman filtering and LQG control, but it is by no means trivial in the case of Poisson-coded processes. For a class of stochastic processes for which the certainty equivalence principle holds, we have provided an exact expression for the optimal cost-to-go and have shown that minimising this cost provides us with an encoder that in fact minimises the incurred cost in the control problem.

Optimality arguments are central to many parts of computational neuroscience, but partial observability and the importance of combining adaptive state estimation and control have rarely been considered in this literature, although they are supported by recent experiments. We believe the present work, while treating only a small subset of the formalisms used in neuroscience, provides a first insight into the differences between estimation and control. Much emphasis has been placed on tracing the parallels between the two (see [19, 20], for example), but one must not forget to take the differences into account as well.

5 Acknowledgements

The research of A.S. was supported by the DFG research training group Sensory Computation in Neural Systems GRK1589-1. The research of R.M. was partially funded by the Technion V.P.R. fund and by the Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI). M.O.
acknowledges support by EU grant FP7-ICT-270327 (ComPLACS).

References

[1] Jun Izawa and Reza Shadmehr. On-line processing of uncertain information in visuomotor control. The Journal of Neuroscience, 28(44):11360–8, October 2008.

[2] Emanuel Todorov and Michael I. Jordan. Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11):1226–35, November 2002.

[3] Peter W. Battaglia and Paul R. Schrater. Humans trade off viewing time and movement duration to improve visuomotor accuracy in a fast reaching task. The Journal of Neuroscience, 27(26):6984–94, June 2007.

[4] Boris Rostislavovich Andrievsky, Aleksei Serafimovich Matveev, and Aleksandr L'vovich Fradkov. Control and estimation under information constraints: Toward a unified theory of control, computation and communications. Automation and Remote Control, 71(4):572–633, 2010.

[5] Hans S. Witsenhausen. A counterexample in stochastic optimum control. SIAM Journal on Control, 6(1):131–147, 1968.

[6] Edison Tse and Yaakov Bar-Shalom. An actively adaptive control for linear systems with random parameters via the dual control approach. IEEE Transactions on Automatic Control, 18(2):109–117, 1973.

[7] Yaakov Bar-Shalom and Edison Tse. Dual effect, certainty equivalence, and separation in stochastic control. IEEE Transactions on Automatic Control, (5), 1974.

[8] D. Huber, D. A. Gutnisky, S. Peron, D. H. O'Connor, J. S. Wiegert, L. Tian, T. G. Oertner, L. L.
Looger, and K. Svoboda. Multiple dynamic representations in the motor cortex during sensorimotor learning. Nature, 484(7395):473–478, April 2012.

[9] A. A. Mattar, Mohammad Darainy, David J. Ostry, et al. Motor learning and its sensory effects: time course of perceptual change and its presence with gradual introduction of load. Journal of Neurophysiology, 109(3):782–791, 2013.

[10] S. Vahdat, M. Darainy, T. E. Milner, and D. J. Ostry. Functionally specific changes in resting-state sensorimotor networks after motor learning. Journal of Neuroscience, 31(47):16907–16915, 2011.

[11] Karl J. Åström. Introduction to Stochastic Control Theory. Courier Dover Publications, Mineola, NY, 1st edition, 2006.

[12] Steve Yaeli and Ron Meir. Error-based analysis of optimal tuning functions explains phenomena observed in sensory neurons. Frontiers in Computational Neuroscience, 4(October):16, 2010.

[13] Alex Susemihl, Ron Meir, and Manfred Opper. Dynamic state estimation based on Poisson spike trains - towards a theory of optimal encoding. Journal of Statistical Mechanics: Theory and Experiment, 2013(03):P03009, March 2013.

[14] Fumio Kanaya and Kenji Nakagawa. On the practical implication of mutual information for statistical decision making. IEEE Transactions on Information Theory, 37(4):1151–1156, 1991.

[15] N. Merhav. Optimum estimation via gradients of partition functions and information measures: a statistical-mechanical perspective. IEEE Transactions on Information Theory, 57(6):3887–3898, 2011.

[16] Dongning Guo, Shlomo Shamai, and Sergio Verdú. Mutual information and minimum mean-square error in Gaussian channels. IEEE Transactions on Information Theory, 51(4):1261–1282, 2005.

[17] David Attwell and Simon B. Laughlin. An energy budget for signaling in the grey matter of the brain. Journal of Cerebral Blood Flow & Metabolism, 21(10):1133–1145, 2001.

[18] R. S. Bucy.
Nonlinear filtering theory. IEEE Transactions on Automatic Control, 10(2):198, 1965.

[19] Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35–45, 1960.

[20] Emanuel Todorov. General duality between optimal control and estimation. 2008 47th IEEE Conference on Decision and Control, (5):4286–4292, 2008.