{"title": "Associative memory in realistic neuronal networks", "book": "Advances in Neural Information Processing Systems", "page_first": 237, "page_last": 244, "abstract": null, "full_text": "Associative memory in realistic neuronal \n\nnetworks \n\nP.E. Latham* \n\nDepartment of Neurobiology \n\nUniversity of California at Los Angeles \n\nLos Angeles, CA 90095 \n\npel@ucla.edu \n\nAbstract \n\nAlmost two decades ago, Hopfield [1] showed that networks of \nhighly reduced model neurons can exhibit multiple attracting fixed \npoints, thus providing a substrate for associative memory. It is still \nnot clear, however, whether realistic neuronal networks can support \nmultiple attractors. The main difficulty is that neuronal networks \nin vivo exhibit a stable background state at low firing rate, typ(cid:173)\nically a few Hz. Embedding attractor is easy; doing so without \ndestabilizing the background is not. Previous work [2, 3] focused \non the sparse coding limit, in which a vanishingly small number of \nneurons are involved in any memory. Here we investigate the case \nin which the number of neurons involved in a memory scales with \nthe number of neurons in the network. In contrast to the sparse \ncoding limit, we find that multiple attractors can co-exist robustly \nwith a stable background state. Mean field theory is used to under(cid:173)\nstand how the behavior of the network scales with its parameters, \nand simulations with analog neurons are presented. \n\nOne of the most important features of the nervous system is its ability to perform \nassociative memory. It is generally believed that associative memory is implemented \nusing attractor networks - experimental studies point in that direction [4- 7], and \nthere are virtually no competing theoretical models. Perhaps surprisingly, however, \nit is still an open theoretical question whether attractors can exist in realistic neu(cid:173)\nronal networks. 
The \"realistic\" feature that is probably hardest to capture is the \nsteady firing at low rates - the background state - that is observed throughout the \nintact nervous system [8- 13]. The reason it is difficult to build an attractor network \nthat is stable at low firing rates, at least in the sparse coding limit, is as follows \n[2,3]: \n\nAttractor networks are constructed by strengthening recurrent connections among \nsub-populations of neurons. The strengthening must be large enough that neurons \nwithin a sub-population can sustain a high firing rate state, but not so large that the \nsub-population can be spontaneously active. This implies that the neuronal gain \nfunctions - the firing rate of the post-synaptic neurons as a function of the average \n\n\u2022 http) / culture.neurobio.ucla.edu/ \"'pel \n\n\ffiring rate of the pre-synaptic neurons - must be sigmoidal: small at low firing rate \nto provide stability, high at intermediate firing rate to provide a threshold (at an \nunstable equilibrium), and low again at high firing rate to provide saturation and \na stable attractor. In other words, a requirement for the co-existence of a stable \nbackground state and multiple attractors is that the gain function of the excitatory \nneurons be super linear at the observed background rates of a few Hz [2,3]. However \n- and this is where the problem lies - above a few Hz most realistic gain function \nare nearly linear or sublinear (see, for example, Fig. Bl of [14]). \nThe superlinearity requirement rests on the implicit assumption that the activity \nof the sub-population involved in a memory does not affect the other neurons in \nthe network. While this assumption is valid in the sparse coding limit , it breaks \ndown in realistic networks containing both excitatory and inhibitory neurons. In \nsuch networks, activity among excitatory cells results in inhibitory feedback. 
This feedback, if powerful enough, can stabilize attractors even without a saturating nonlinearity, essentially by stabilizing the equilibrium (above considered unstable) on the steep part of the gain function. The price one pays, though, is that a reasonable fraction of the neurons must be involved in each of the memories, which takes us away from the sparse coding limit and thus reduces network capacity [15]. \n\n1 The model \n\nA relatively good description of neuronal networks is provided by synaptically coupled, conductance-based neurons. However, because communication is via action potentials, such networks are difficult to analyze. An alternative is to model neurons by their firing rates. While this is unlikely to capture the full temporal network dynamics [16], it is useful for studying equilibria. In such simplified models, the equilibrium firing rate of a neuron is a function of the firing rates of all the other neurons in the network. Letting ν_Ei and ν_Ii denote the firing rates of the excitatory and inhibitory neurons, respectively, and assuming that synaptic input sums linearly, the equilibrium equations may be written \n\nν_Ei = φ_Ei(Σ_j A_ij^EE ν_Ej, Σ_j A_ij^EI ν_Ij)   (1a) \nν_Ii = φ_Ii(Σ_j A_ij^IE ν_Ej, Σ_j A_ij^II ν_Ij)   (1b) \n\nHere φ_E and φ_I are the excitatory and inhibitory gain functions and A_ij determines the connection strength from neuron j to neuron i. The gain functions can, in principle, be derived from conductance-based model equations [17]. \n\nOur goal here is to determine under what conditions Eq. (1) allows both attractors and a stable state at low firing rate. To accomplish this we will use mean field theory. While this theory could be applied to the full set of equations, to reduce complexity we make a number of simplifications. 
First, we let the inhibitory neurons be completely homogeneous (φ_Ii independent of i, and connectivity to and from inhibitory neurons all-to-all and uniform). In that case, Eq. (1b) becomes simply ν_I = φ_I(ν_E, ν_I), where ν_E and ν_I are the average firing rates of the excitatory and inhibitory neurons. Solving for ν_I and inserting the resulting expression into Eq. (1a) results in the expression ν_Ei = φ_Ei(Σ_j A_ij^EE ν_Ej, A^EI ν_I(ν_E)), where A^EI ≡ Σ_j A_ij^EI. \n\nSecond, we let φ_Ei have the form φ_Ei(u, v) = φ_E(x_i + bu - ev), where x_i is a Gaussian random variable, and similarly for φ_I (except with different constants b and e and no dependence on i). Finally, we assume that φ_I is threshold linear and the network operates in a regime in which the inhibitory firing rate is above zero. With these simplifications, and a trivial redefinition of constants, Eq. (1a) becomes \n\nν_i = φ(x_i + β N^{-1} Σ_j A_ij ν_j - (1+a)ν)   (2) \n\nWe have dropped the sub- and superscript E, since Eq. (2) refers exclusively to excitatory neurons, defined ν to be the average firing rate, ν ≡ N^{-1} Σ_i ν_i, and rescaled parameters. We let the function φ be O(1), so β can be interpreted as the gain. The parameter p is the number of memories. The reduction from Eq. (1) to Eq. (2) was done solely to simplify the analysis; the techniques we will use apply equally well to the general case, Eq. (1). \n\nNote that the gain function in Eq. (2) decreases with increasing average firing rate, since its argument contains -(1+a)ν and a is positive. This negative dependence on ν arises because we are working in the large coupling regime in which excitation and inhibition are balanced [18, 19]. The negative coupling to firing rate has important consequences for stability, as we will see below. 
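The stabilizing role of the -(1+a)ν term can be illustrated with a scalar caricature of Eq. (2); this is a sketch with illustrative numbers, in which the mean recurrent input is replaced by (β - (1+a))ν:

```python
def phi(x):
    # threshold-linear gain: never saturates at high rates
    return max(0.0, x)

def settle(beta, a, x0=1.5, nu=10.0, tau=10.0, dt=0.1, steps=5000):
    """Integrate tau * dnu/dt = phi(x0 + (beta - (1 + a)) * nu) - nu."""
    for _ in range(steps):
        nu += (dt / tau) * (phi(x0 + (beta - (1.0 + a)) * nu) - nu)
        nu = min(nu, 1e6)  # cap so a runaway stays finite
    return nu

# Net coupling beta - (1+a) = -0.5: rate settles at x0/(1 + 0.5) = 1.0.
print(settle(beta=1.0, a=0.5))
# Net coupling +1.5 exceeds the leak: the rate runs away (hits the cap).
print(settle(beta=3.0, a=0.5))
```

Even though phi is unbounded, the inhibition-mediated negative coupling produces a stable low-rate equilibrium; stability is lost only when the effective excitatory gain outweighs the leak plus inhibition.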
\n\nWe let the connectivity matrix have the form \n\nA_ij = C_ij g(W_ij + J_ij) / ⟨g⟩ \n\nHere N is the number of excitatory neurons; C_ij, which regulates the degree of connectivity, is 1/c with probability c and 0 with probability (1-c) (except C_ii = 0, meaning no autapses); g(z) is an O(1) clipping function that keeps weights from falling below zero or getting too large; ⟨g⟩ is the mean value of g(z), defined in Eq. (4) below; W_ij, which corresponds to background connectivity, is a random matrix whose elements are Gaussian distributed with mean 1 and variance δw²; and J_ij produces the attractors. We will follow the Hopfield prescription and write J_ij as \n\nJ_ij = (ε/p^{1/2}) Σ_μ η_μi η_μj   (3) \n\nwhere ε is the coupling strength among neurons involved in the memories, and the patterns η_μi determine which neurons participate in each memory. The η_μi are a set of uncorrelated vectors with zero mean and unit variance. In simulations we use η_μi = [(1-f)/f]^{1/2} with probability f and -[f/(1-f)]^{1/2} with probability 1-f, so a fraction f of the neurons are involved in each memory. Other choices are unlikely to significantly change our results. \n\n2 Mean field equations \n\nThe main difficulty in deriving the mean field equations from Eq. (2) is separating the signal from the noise. Our first step in this endeavor is to analyze the noise associated with the clipped weights. To do this we break C_ij g(W_ij + J_ij) into two pieces: C_ij g(W_ij + J_ij) = ⟨g⟩ + ⟨g'⟩ J_ij + δC_ij, where δC_ij is defined as the residual. The angle brackets around g represent an average over the distributions of W_ij and J_ij, and a prime denotes a derivative. In the large p limit, δC_ij can be treated as a random matrix whose main role is to increase the effective noise [20]. 
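The connectivity prescription above can be instantiated directly. A minimal sketch follows; the parameter values are illustrative, the clip-to-[0, 2] form of g is the one used in the simulations described below, and the ⟨g⟩ normalization is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, f, c, eps, dw = 500, 10, 0.1, 0.3, 4.0, 0.3

# Patterns: eta = sqrt((1-f)/f) with prob. f, -sqrt(f/(1-f)) with prob. 1-f,
# giving zero mean and unit variance.
hi, lo = np.sqrt((1 - f) / f), -np.sqrt(f / (1 - f))
eta = np.where(rng.random((p, N)) < f, hi, lo)

# Hopfield-style couplings, Eq. (3); elements ~ Gaussian(0, eps^2) for large p.
J = (eps / np.sqrt(p)) * eta.T @ eta

W = 1.0 + dw * rng.standard_normal((N, N))          # background weights, mean 1
C = np.where(rng.random((N, N)) < c, 1.0 / c, 0.0)  # dilution: 1/c with prob. c
np.fill_diagonal(C, 0.0)                            # no autapses

def g(z):
    # clipping function: keeps weights in [0, 2]
    return np.clip(z, 0.0, 2.0)

A = C * g(W + J)  # connectivity, up to the <g> normalization
```

A quick sanity check on such a construction is that the patterns have zero mean and unit variance and that the clipped weights stay within the allowed range.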
The mean of δC_ij is zero, and its variance normalized to ⟨g⟩²/c, which we denote σ², can be computed from the averages defined in Eq. (4). For large p, the elements of J_ij are Gaussian with zero mean and variance ε², so the averages involving g can be written \n\n⟨g_k⟩ = ∫ dz [2π(δw² + ε²)]^{-1/2} exp[-z²/2(δw² + ε²)] g_k(1 + z)   (4) \n\nwhere k can be either an exponent or a prime and the \"1\" in g(1 + z) corresponds to the mean of W_ij. In our simulations we use the clipping function g(z) = z if z is between 0 and 2, 0 if z ≤ 0, and 2 if z ≥ 2. \n\nOur main assumptions in the development of a mean field theory are that Σ_{j≠i} δC_ij ν_j is a Gaussian random variable, and that δC_ij and ν_j are independent. Consequently, its variance is proportional to ⟨ν²⟩, where ⟨ν²⟩ ≡ N^{-1} Σ_i ν_i² is the second moment of the firing rate. Letting δ_i be a zero mean Gaussian random variable with variance δ² ≡ σ²⟨ν²⟩/cN, we can use the above assumptions along with the definition of J_ij, Eq. (3), to write Eq. (2) as \n\nν_i = φ(x_i + βε_c p^{-1/2} Σ_μ η_μi m_μ - (1+a)ν + δ_i),   m_μ ≡ N^{-1} Σ_j η_μj ν_j   (5) \n\nWe have defined the clipped memory strength, ε_c, as ε_c ≡ ε⟨g'⟩/⟨g⟩. While it is not totally obvious from the above equations, it can be shown that both σ² and ε_c become independent of ε for large ε. This makes network behavior robust to changes in ε, the strength of the memories, so long as ε is large. \n\nDerivation of the mean field equations from Eq. (5) follows standard methods [21, 22]. For definiteness we take φ(x) to be threshold linear: φ(x) = max(0, x). For the case of one active memory, the mean field equations may then be written in the form \n\nw = [βε_c/(1-r)] ΔF_1(w, z)   (6a) \n\n[Eqs. (6b)-(6d), which determine z, r, and q self-consistently, are not recoverable from this copy; they involve α, β, ε_c, f, c, σ, x_0, θ_0, and the combinations F_0(z) + f ΔF_0(w, z) and F_2(z) + f ΔF_2(w, z).] \n\nwhere α ≡ p/N is the load parameter, x_0 and θ_0²/p are the mean and variance of x_i (see Eq. (2)), and, recall, f is the fraction of neurons that participate in each memory. 
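The functions F_k and ΔF_k entering Eqs. (6) are Gaussian integrals, defined just below; for the k = 0, 1, 2 needed here they reduce to closed forms in the standard normal density and cumulative distribution. A sketch, assuming the definition F_k(z) = (1/k!) ∫_{-z}^{∞} dξ (2π)^{-1/2} (z + ξ)^k exp(-ξ²/2), with the 1/k! normalization chosen to match the quoted large-z behavior F_k(z) → z^k/k!:

```python
import math

def Phi(z):
    # standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def npdf(z):
    # standard normal density
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def F(k, z):
    """F_k(z) = (1/k!) * int_{-z}^{inf} dxi (2*pi)^{-1/2} (z + xi)^k exp(-xi^2/2)."""
    if k == 0:
        return Phi(z)
    if k == 1:
        return z * Phi(z) + npdf(z)
    if k == 2:
        return 0.5 * ((z * z + 1.0) * Phi(z) + z * npdf(z))
    raise ValueError("only k = 0, 1, 2 are used in Eqs. (6)")

def dF(k, w, z):
    # Delta F_k(w, z) = F_k(w + z) - F_k(z)
    return F(k, w + z) - F(k, z)
```

For large positive z these behave as z^k/k!, and for large negative z they vanish as exp(-z²/2), in agreement with the limits stated below.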
The functions F_k and ΔF_k are defined by \n\nF_k(z) ≡ (1/k!) ∫_{-z}^{∞} dξ (2π)^{-1/2} (z + ξ)^k exp(-ξ²/2) \nΔF_k(w, z) ≡ F_k(w + z) - F_k(z). \n\nFor large negative z, F_k(z) vanishes as exp(-z²/2), while for large positive z, F_k(z) → z^k/k!. \n\nThe average firing rate, ν, and the strength of the memory, m ≡ N^{-1} Σ_i η_1i ν_i (taken without loss of generality to be the overlap with pattern 1), are given in terms of z and w [expressions involving x_0 not recoverable from this copy]. \n\n3 Results \n\nThe mean field equations can be understood by examining Eqs. (6a) and (6b). The first of these, Eq. (6a), is a rescaled form of the equation for the overlap, m. (From the definition of ΔF_1 given above, it can be seen that m is proportional to w for small w.) This equation always has a solution at w = 0 (and thus m = 0), which corresponds to a background state with no memories active. If βε_c is large enough, there is a second solution with w (and thus m) greater than zero. This second solution corresponds to a memory. The other relevant equation, Eq. (6b), describes the behavior of the mean firing rate. This equation looks complicated only because the noise - the variation in firing rate from neuron to neuron - must be determined self-consistently. \n\nThe solutions to Eqs. (6a) and (6b) are plotted in Fig. 1 in the z-w plane. The solid lines, including the horizontal line at w = 0, represent the solution to Eq. (6a), the \n\nFigure 1: Graphical solution of Eqs. (6a) and (6b). Solid lines, including the one at w = 0: solution to Eq. (6a). Dashed line: solution to Eq. (6b). The arrows indicate approximate flow directions: vertical arrows indicate time evolution of w at fixed z; horizontal arrows indicate time evolution of z at fixed w. The black squares show potentially stable fixed points. 
Note the exchange of stability to the right of the solid curve, indicating that intersections too far to the right will be unstable. \n\ndashed line the solution to Eq. (6b), and their intersections solutions to both. While stability cannot be inferred from the equilibrium equations, a reasonable assumption is that the evolution equations for the firing rates, at least near an equilibrium, have the form τ dν_i/dt = φ_i - ν_i. In that case, the arrows represent flow directions, and we see that there are potentially stable equilibria at the intersections marked by the solid squares. \n\nNote that in the sparse coding limit, f → 0, z is independent of w, meaning that the mean firing rate, ν, is independent of the overlap, m. In this limit there can be no feedback to inhibitory neurons, and thus no chance for stabilization. In terms of Fig. 1, the effect of letting f → 0 is to make the dashed line vertical. This eliminates the possibility of the upper stable equilibrium (the solid square at w > 0), and returns us to the situation where a superlinear gain function is required for attractors to be embedded, as discussed in the introduction. \n\nTwo important conclusions can be drawn from Fig. 1. First, the attractors can be stable even though the gain functions never saturate (recall that we used threshold-linear gain functions). The stabilization mechanism is feedback to inhibitory neurons, via the -(1+a)ν term in Eq. (2). This feedback is what makes the dashed line in Fig. 1 bend, allowing a stable equilibrium at w > 0. Second, if the dashed line shifts to the right relative to the solid line, the background becomes destabilized. This is because there is an exchange of stability, as indicated by the arrows. 
\nThus, there is a tradeoff: w, and thus the mean firing rate of the memory neurons, can be increased by shifting the dashed line up or to the right, but eventually the background becomes destabilized. Shifting the dashed line to the left, on the other hand, will eventually eliminate the solution at w > 0, destroying all attractors but the background. \n\nFor fixed load parameter α, fraction of neurons involved in a memory, f, and degree of connectivity, c, there are three parameters that have a large effect on the location of the equilibria in Fig. 1: the gain, β, the clipped memory strength, ε_c, and the degree of heterogeneity in individual neurons, θ_0. The effect of the first two can be seen in Fig. 2, which shows a stability plot in the ε-β plane, determined by numerically solving the equations τ dν_i/dt = φ_i - ν_i (see Eq. (2)). The filled circles indicate regions where memories were embedded without destabilizing the background, open circles indicate regions where no memories could be embedded, and xs indicate regions where the background was unstable. As discussed above, ε_c becomes approximately independent of the strength of the memories, ε, when ε becomes large. This is seen in Fig. 2A, in which network behavior stabilizes when ε becomes larger than about 4; increasing ε beyond 8 would, presumably, produce no surprises. There is some sensitivity to gain, β: when ε > 4, memories co-existed with a stable background for β in a ±15% range. Although not shown, the same was true of θ_0: increasing it by about 20% eliminated the attractors; decreasing it by the same amount destabilized the background. However, more detailed analysis indicates that the stability region gets larger as the number of neurons in the network, N, increases. This is because fluctuations destabilize the background, and those fluctuations fall off as N^{-1/2}. 
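The numerical procedure behind such a stability diagram can be sketched as follows. This is a schematic only: the parameter values, the cue, and the classification thresholds are ad hoc assumptions for illustration, not a reproduction of the actual simulations:

```python
import numpy as np

def classify(eps, beta, N=300, p=4, f=0.1, c=0.3, a=0.5, dw=0.3, x0=1.5,
             theta0=1.0, tau=10.0, dt=0.5, steps=400, seed=0):
    """Classify one (eps, beta) point by integrating tau dnu_i/dt = phi_i - nu_i
    from a background start and from a memory-cued start."""
    rng = np.random.default_rng(seed)
    hi, lo = np.sqrt((1 - f) / f), -np.sqrt(f / (1 - f))
    eta = np.where(rng.random((p, N)) < f, hi, lo)       # memory patterns
    J = (eps / np.sqrt(p)) * eta.T @ eta                 # Hopfield couplings
    W = 1.0 + dw * rng.standard_normal((N, N))           # background weights
    C = np.where(rng.random((N, N)) < c, 1.0 / c, 0.0)   # dilution
    np.fill_diagonal(C, 0.0)
    A = C * np.clip(W + J, 0.0, 2.0)                     # clipped connectivity
    x = x0 + theta0 * rng.standard_normal(N)             # heterogeneous bias

    def run(nu):
        for _ in range(steps):
            drive = x + beta * (A @ nu) / N - (1.0 + a) * nu.mean()
            nu = np.clip(nu + (dt / tau) * (np.maximum(0.0, drive) - nu), 0.0, 1e6)
        return nu

    background = run(np.full(N, 3.0))
    cued = run(np.where(eta[0] > 0, 30.0, 3.0))          # kick pattern-1 neurons
    if background.mean() > 15.0:                         # ad hoc threshold
        return "background unstable"
    overlap = (eta[0] * cued).mean()                     # overlap m with pattern 1
    return "memory" if overlap > 5.0 else "no memory"

print(classify(eps=4.0, beta=2.0))
```

Scanning `classify` over a grid of (eps, beta) values and marking the three outcomes reproduces the qualitative structure of a stability plot in the ε-β plane.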
\n\nFigure 2: A. Stability diagram, found by solving the set of equations τ dν_i/dt = φ_i - ν_i with the argument of φ_i given in Eq. (2). Filled circles: memories co-exist with a stable background (also outlined with solid lines); open circles: memories could not be embedded; xs: background was unstable. The average background rate, when the background was stable, was around 3 Hz. The network parameters were θ_0 = 6, x_0 = 1.5, a = 0.5, c = 0.3, α = 2.5%, and δw = 0.3. 2000 neurons were used in the simulations. These parameters led to an effective gain, p^{1/2} β ε_c, of about 10, which is consistent with the gain in large networks in which each neuron receives ~5-10,000 inputs. B. Plot of firing rate of memory neurons, m, when the memory was active (upper trace) and not active (lower trace) versus ε at β = 2. \n\n4 Discussion \n\nThe main outcome of this analysis is that attractors can co-exist with a stable background when neurons have generic threshold-linear gain functions, so long as the sparse coding limit is avoided. The parameter regime for this co-existence is much larger than for attractor networks that operate in the sparse coding limit [2, 23]. While these results are encouraging, they do not definitively establish that attractors can exist in realistic networks. Future work must include inhibitory neurons, incorporate a much larger exploration of parameter space to ensure that the results are robust, and ultimately involve simulations with spiking neurons. \n\n5 Acknowledgements \n\nThis work was supported by NIMH grant #R01 MH62447. \n\nReferences \n\n[1] J.J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci., 79:2554-2558, 1982. \n\n[2] N. Brunel. 
Persistent activity and the single-cell frequency-current curve in a cortical network model. Network: Computation in Neural Systems, 11:261-280, 2000. \n\n[3] P.E. Latham and S.N. Nirenberg. Intrinsic dynamics in cultured neuronal networks. Soc. Neuroscience Abstract, 25:2259, 1999. \n\n[4] J.M. Fuster and G.E. Alexander. Neuron activity related to short-term memory. Science, 173:652-654, 1971. \n\n[5] Y. Miyashita. Inferior temporal cortex: where visual perception meets memory. Annu. Rev. Neurosci., 16:245-263, 1993. \n\n[6] P.S. Goldman-Rakic. Cellular basis of working memory. Neuron, 14:477-485, 1995. \n\n[7] R. Romo, C.D. Brody, A. Hernandez, and L. Lemus. Neuronal correlates of parametric working memory in the prefrontal cortex. Nature, 399:470-473, 1999. \n\n[8] C.D. Gilbert. Laminar differences in receptive field properties of cells in cat primary visual cortex. J. Physiol., 268:391-421, 1977. \n\n[9] Y. Lamour, P. Dutar, and A. Jobert. Cerebral neocortical neurons in the aged rat: spontaneous activity, properties of pyramidal tract neurons and effect of acetylcholine and cholinergic drugs. Neuroscience, 16:835-844, 1985. \n\n[10] M.B. Szente, A. Baranyi, and C.D. Woody. Intracellular injection of apamin reduces a slow potassium current mediating afterhyperpolarizations and IPSPs in neocortical neurons of cats. Brain Res., 461:64-74, 1988. \n\n[11] I. Salimi, H.H. Webster, and R.W. Dykes. Neuronal activity in normal and deafferented forelimb somatosensory cortex of the awake cat. Brain Res., 656:263-273, 1994. \n\n[12] J.F. Herrero and P.M. Headley. Cutaneous responsiveness of lumbar spinal neurons in awake and halothane-anesthetized sheep. J. Neurophysiol., 74:1549-1562, 1997. \n\n[13] K. Ochi and J.J. Eggermont. Effects of quinine on neural activity in cat primary auditory cortex. Hear. Res., 105:105-118, 1997. \n\n[14] P.E. Latham, B.J. Richmond, P.G. Nelson, and S.N. Nirenberg. 
Intrinsic dynamics in neuronal networks. I. Theory. J. Neurophysiol., 83:808-827, 2000. \n\n[15] M.V. Tsodyks and M.V. Feigel'man. The enhanced storage capacity in neural networks with low activity level. Europhys. Lett., 6:101-105, 1988. \n\n[16] A. Treves. Mean-field analysis of neuronal spike dynamics. Network, 4:259-284, 1993. \n\n[17] O. Shriki, D. Hansel, and H. Sompolinsky. Modeling neuronal networks in cortex by rate models using the current-frequency response properties of cortical cells. Soc. Neuroscience Abstract, 24:143, 1998. \n\n[18] C. van Vreeswijk and H. Sompolinsky. Chaos in neuronal networks with balanced excitatory and inhibitory activity. Science, 274:1724-1726, 1996. \n\n[19] C. van Vreeswijk and H. Sompolinsky. Chaotic balanced state in a model of cortical circuits. Neural Comput., 10:1321-1371, 1998. \n\n[20] H. Sompolinsky. Neural networks with nonlinear synapses and a static noise. Phys. Rev. A, 34:2571-2574, 1986. \n\n[21] J. Hertz, A. Krogh, and R.G. Palmer. Introduction to the theory of neural computation. Addison Wesley, Redwood City, CA, 1991. \n\n[22] A.N. Burkitt. Retrieval properties of attractor neural networks that obey Dale's law using a self-consistent signal-to-noise analysis. Network: Computation in Neural Systems, 7:517-531, 1996. \n\n[23] D.J. Amit and N. Brunel. Dynamics of a recurrent network of spiking neurons before and following learning. Network, 8:373-404, 1997.", "award": [], "sourceid": 2056, "authors": [{"given_name": "Peter", "family_name": "Latham", "institution": null}]}