{"title": "Over-complete representations on recurrent neural networks can support persistent percepts", "book": "Advances in Neural Information Processing Systems", "page_first": 541, "page_last": 549, "abstract": "A striking aspect of cortical neural networks is the divergence of a relatively small number of input channels from the peripheral sensory apparatus into a large number of cortical neurons, an over-complete representation strategy. Cortical neurons are then connected by a sparse network of lateral synapses. Here we propose that such architecture may increase the persistence of the representation of an incoming stimulus, or a percept. We demonstrate that for a family of networks in which the receptive field of each neuron is re-expressed by its outgoing connections, a represented percept can remain constant despite changing activity. We term this choice of connectivity REceptive FIeld REcombination (REFIRE) networks. The sparse REFIRE network may serve as a high-dimensional integrator and a biologically plausible model of the local cortical circuit.", "full_text": "Over-complete representations on recurrent neural\n\nnetworks can support persistent percepts\n\nShaul Druckmann\n\nJanelia Farm Research Campus\nHoward Hughes Medical Institute\n\nAshburn, VA 20147\n\ndruckmanns@janelia.hhmi.org\n\nmitya@janelia.hhmi.org\n\nDmitri B. Chklovskii\n\nJanelia Farm Research Campus\nHoward Hughes Medical Institute\n\nAshburn, VA 20147\n\nAbstract\n\nA striking aspect of cortical neural networks is the divergence of a relatively small\nnumber of input channels from the peripheral sensory apparatus into a large num-\nber of cortical neurons, an over-complete representation strategy. Cortical neurons\nare then connected by a sparse network of lateral synapses. Here we propose that\nsuch architecture may increase the persistence of the representation of an incom-\ning stimulus, or a percept. 
We demonstrate that for a family of networks in which the receptive field of each neuron is re-expressed by its outgoing connections, a represented percept can remain constant despite changing activity. We term this choice of connectivity REceptive FIeld REcombination (REFIRE) networks. The sparse REFIRE network may serve as a high-dimensional integrator and a biologically plausible model of the local cortical circuit.\n\n1 Introduction\n\nTwo salient features of cortical networks are the numerous recurrent lateral connections within a cortical area and the high ratio of cortical cells to sensory input channels. In their seminal study [1], Olshausen and Field argued that such architecture may subserve sparse over-complete representations, which maximize representation accuracy while minimizing the metabolic cost of spiking. In this framework, lateral connections between neurons with correlated receptive fields mediate explaining away of the sensory input features [2]. With the exception of an Ising-like generative model for the lateral connections [3] and a mutual information maximization approach [4], most theoretical work on lateral connections did not focus on representation over-completeness (see [5] and references therein).\nHere, we propose that over-complete representations on recurrently connected networks offer a solution to a long-standing puzzle in neuroscience, that of maintaining a stable sensory percept in the absence of time-invariant persistent activity (rate of action potential discharge). In order for sensory percepts to guide actions, their duration must extend to behavioral time scales: hundreds of milliseconds or seconds, if not more. However, many cortical neurons exhibit time-varying activity even during working memory tasks (see [6, 7] and references therein). 
If each neuron codes for orthogonal directions in stimulus space, any change in the activity of neurons would cause a distortion in the network representation, implying that a percept cannot be maintained.\nWe point out that, in an over-complete representation, network activity can change without any change in the percept, allowing persistent percepts to be maintained in the face of variable neuronal activity. This results from the fact that the activity space has a higher dimensionality than that of the stimulus space. When the activity changes in a direction nulled by the projection onto stimulus space, the percept remains invariant.\nWhat lateral connectivity can support persistent percepts, even in the face of changing neuronal activity? We derive the condition on lateral connection weights for networks to maintain persistent percepts, thus defining a family of REceptive FIeld REcombination networks. Furthermore, we propose that minimizing synaptic volume cost favors sparse REFIRE networks, whose properties are remarkably similar to those of the cortex. Such REFIRE networks act as high-dimensional integrators of sensory input.\n\n2 Model\n\nWe consider n sensory neurons, their activity marked by s in R^n, which project to a layer of m cortical neurons, where m > n. The activity of the m neurons, marked by a in R^m, at any given time represents a percept of a certain stimulus. The represented percept s is a linear superposition of feature vectors, stacked as columns of matrix D, weighted by the neuronal activity a:\n\ns = Da.\n\n(1)\n\nFor instance, s could represent the intensity level of pixels in a patch of the visual field and the columns of D a dictionary chosen to represent the patches, e.g. a set of Gabor filters [8]. 
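The non-uniqueness of a for a given s in Equation (1) is the freedom the rest of the argument exploits, and it can be checked directly. This is a minimal numpy sketch; the dimensions and the random dictionary are illustrative stand-ins, not the learned dictionary used later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 2, 5                          # n sensory channels, m > n cortical neurons
D = rng.standard_normal((n, m))      # columns of D are the feature vectors

s = np.array([1.0, -0.5])            # a percept in R^n

# Minimum-norm coefficients: a = D^T (D D^T)^{-1} s
a_frame = D.T @ np.linalg.solve(D @ D.T, s)

# Adding any null-space vector of D changes the activity but not the percept
null_basis = np.linalg.svd(D)[2][n:]      # (m - n) directions with D v = 0
a_other = a_frame + 0.7 * null_basis[0]

print(np.allclose(D @ a_frame, s))   # True: a_frame represents s
print(np.allclose(D @ a_other, s))   # True: different activity, same percept
print(np.allclose(a_frame, a_other)) # False
```

The null-space rows of the SVD factor supply the directions in which activity can move while the represented percept stays put.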
Since m > n, the columns of dictionary D cannot be orthogonal and hence define a frame rather than a basis [9].\n\n2.1 Frames\n\nA frame is a generalization of the idea of a basis to linearly dependent elements [9]. The mapping between the activity space R^m and the sensory space R^n is accomplished by the synthesis operator, D. The adjoint operator D^T is called the analysis operator and their composition the frame operator DD^T. As a consequence of the columns of D being a frame, a given vector in the space of percepts can be represented non-uniquely, i.e. with different coefficients expressed by neuronal activity a. The general form of the coefficients is given by:\n\na = D^T(DD^T)^{−1}s + a⊥,\n\n(2)\n\nwhere a⊥ belongs to the null-space of D, i.e. Da⊥ = 0.\nOne choice of coefficients, called frame coefficients, corresponds to a⊥ = 0 and minimizes their l2 norm. Alternatively one can choose a set of coefficients minimizing the l1 norm. These can be computed by Matching Pursuit [10], Basis Pursuit [11] or LASSO [12], or by the dynamics of a neural network with feedforward and lateral connections [13]. In summary, the neural activity is an over-complete representation of the sensory percepts, the m columns of D acting as a frame for the space of sensory percepts.\n\n2.2 Persistent percepts and lateral connectivity\n\nNow, we derive a necessary and sufficient condition on the lateral connections L such that for every a the percept represented by Equation (1) persists. We focus on the dynamics of a following a transient presentation of the sensory stimulus. The dynamics of a network with lateral connectivity matrix L are given by:\n\nȧ = −a + La,\n\n(3)\n\nwhere time is measured in units of the neuronal membrane time constant. Requiring time-invariant persistent activity amounts to ȧ = 0 or\n\na = La.\n\n(4)\n\nHowever, this is not necessary if we require only the percept represented by the network to be fixed. Instead,\n\nṡ = Dȧ = D(−a + La) = 0.\n\n(5)\n\nThus, setting the derivative of s to zero is tantamount to\n\nDa = DLa.\n\n(6)\n\nIf we require persistent percepts for any a, then:\n\nD = DL.\n\n(7)\n\nEquation (7) has a trivial solution L = I, which corresponds to a network with no actual lateral connections and only autapses. We do not consider this solution further for two reasons. First, autapses are extremely rare among cortical neurons [14]. Second, recurrent networks better support persistency than autapses [15, 16].\nThe intuition behind the derivation of Equation (7) is as follows: as the activity of each neuron changes due to the first term in the rhs of Equation (5), its contribution to the percept may change. To compensate for this change without necessarily keeping the activity fixed, we require that the other neurons adjust their activity according to Equation (6).\nThe condition imposed by Equation (7) on the synaptic weights can be understood as follows. For each neuron j, the sum of its post-synaptic partners' receptive fields, weighted by the synaptic efficacies from neuron j to the other neurons, equals the receptive field of neuron j. Thus, the other neurons get excited by exactly the amount that it would take for them to replace the lost contribution to the percept. Equation (7) and its non-trivial solutions that maintain persistent percepts are the main results of the present study. We term non-trivial solutions of Equation (7) REceptive FIeld RE-expression, or REFIRE, networks due to the intuition underlying their definition.\nSome patterns of activity satisfying Equation (4) will remain time-invariant themselves. 
These correspond to patterns spanned by the right eigenvectors of L with an eigenvalue of one. Note that in order to satisfy Equation (7) a right eigenvector v of L must either have an eigenvalue of one or be in the null-space of D.\nThere are infinitely many solutions satisfying Equation (7), since there are m × n equations for the m × m variables in L. A general solution is given by:\n\nL = D^T(DD^T)^{−1}D + L⊥,\n\n(8)\n\nwhere L⊥ indicates a component of L corresponding to the null-space of D, i.e. DL⊥ = 0. We shall use these degrees of freedom to require a zero diagonal for L, thus avoiding autapses.\n\nFigure 1: Schematic network diagram and Mercedes-Benz example. Left: Network diagram. Middle: Directions of vectors in the MB example. Right: visualization of L.\n\n2.3 An example: the Mercedes-Benz frame\n\nIn order to present a more intuitive view of the concept of persistent percepts we consider the Mercedes-Benz frame [17]. This simple frame spans the R^2 plane with three frame elements: [0 1], [−√3/2 −1/2], [√3/2 −1/2]. In this case, the frame operator DD^T has a particularly simple form, being proportional to the identity matrix, indicating that the frame is tight. The first term in the general form of L (Equation (8)) has a non-zero diagonal, which can be removed by adding L⊥, a matrix with all its entries equal to one (times a scalar). Thus, L is:\n\nL =\n[  0  −1  −1\n  −1   0  −1\n  −1  −1   0 ]\n\nThis seems a rather unlikely candidate matrix to support persistent percepts. However, consider starting out with the vector a0 = [1 0 0], representing the point [0 1] on the plane; after convergence of the dynamics we have a = [2/3 −1/3 −1/3]. This new activity vector represents exactly the same point on the plane: Da = [0 1]. 
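This convergence claim is easy to verify numerically. The sketch below Euler-integrates the dynamics ȧ = −a + La for the Mercedes-Benz frame; the step size and duration are arbitrary illustrative choices.

```python
import numpy as np

r = np.sqrt(3) / 2
D = np.array([[0.0,  -r,    r  ],   # Mercedes-Benz frame: columns are the
              [1.0, -0.5, -0.5]])   # three frame elements in R^2
L = np.array([[ 0.0, -1.0, -1.0],
              [-1.0,  0.0, -1.0],
              [-1.0, -1.0,  0.0]])

assert np.allclose(D, D @ L)        # the REFIRE condition, D = DL

a = np.array([1.0, 0.0, 0.0])       # initial activity, representing [0, 1]
dt = 0.01
for _ in range(2000):               # 20 membrane time constants
    a = a + dt * (-a + L @ a)       # Euler step of a' = -a + L a

print(np.round(a, 3))               # activity converged to [2/3, -1/3, -1/3]
print(np.round(D @ a, 3))           # percept still [0, 1]: it never moved
```

The activity relaxes exactly as stated in the text while D a stays pinned, because D(L − I) = 0 makes the percept a conserved quantity of the dynamics.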
Thus, the percept, the point on the plane, remained constant despite changing neuronal activity. Note that some patterns of activity will remain strictly persistent themselves. These correspond to vectors which are a linear combination of the right eigenvectors of L with an eigenvalue of one. In this case, these eigenvectors are: v1 = [−1 1 0], v2 = [1/2 1/2 −1].\n\n2.4 The sparse REFIRE network\n\nWhich members of the family of REFIRE networks obeying Equation (7) are most likely to model cortical networks? In the cortex, the connectivity is sparse and the synaptic weights are distributed exponentially [18, 19]. These measurements are consistent with minimizing a cost proportional to synaptic weight, such as, for example, synaptic volume. Motivated by these observations, we choose each column of L as a sparse representation of each individual dictionary element by every other element. Define D_j = [d_1, d_2, . . . , d_{j−1}, d_{j+1}, . . . , d_m]. We shall denote the sparse approximation coefficients by β. Therefore:\n\nβ*_j = argmin_{β_j ∈ R^{m−1}} ||d_j − D_j β_j||_2^2 + λ||β_j||_1.\n\n(9)\n\nThese are vectors in R^{m−1}; we now need to insert a zero in the position of the dictionary element that was extracted for each of these vectors. Denote by β̃_j the vector in R^m obtained by inserting a zero at the jth location of β*_j. The connectivity of our model network is given by L = [β̃_1, β̃_2, . . . , β̃_m] in R^{m×m}.\nWe call this form of L the sparse REFIRE network. Similar networks were previously constructed on the raw data (or image patches) [20, 21], while sparse REFIRE networks reflect the relationship among dictionary elements. 
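Equation (9) can be prototyped with any lasso solver. The sketch below uses a plain ISTA loop on a small random dictionary; the sizes, the value of λ, and the solver itself are illustrative stand-ins for the SPAMS toolbox and the learned dictionary used in Section 3.

```python
import numpy as np

def lasso_ista(A, y, lam, n_iter=3000):
    """Minimize ||y - A b||_2^2 + lam * ||b||_1 with plain ISTA."""
    t = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)  # step = 1 / Lipschitz constant
    b = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = b + 2.0 * t * A.T @ (y - A @ b)                    # gradient step
        b = np.sign(g) * np.maximum(np.abs(g) - t * lam, 0.0)  # soft-threshold
    return b

rng = np.random.default_rng(1)
n, m = 8, 24                                     # toy sizes, not the paper's
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)                   # unit-norm dictionary elements

# Column j of L: sparse re-expression of d_j by the other elements (Eq. (9)),
# with a zero re-inserted at position j so that there are no autapses.
L = np.zeros((m, m))
for j in range(m):
    Dj = np.delete(D, j, axis=1)
    L[:, j] = np.insert(lasso_ista(Dj, D[:, j], lam=0.01), j, 0.0)

mismatch = np.linalg.norm(D - D @ L) / np.linalg.norm(D)
print(mismatch < 0.1)                            # approximate REFIRE condition
print(np.allclose(np.diag(L), 0.0))              # True: no autapses
print(np.mean(np.abs(L) > 1e-6))                 # fraction of non-zero weights
```

Because the dictionary is over-complete, each column can be re-expressed by the others almost exactly, and the l1 penalty keeps only a small subset of the candidate connections non-zero.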
Previously, the dependencies between dictionary elements were captured by tree-graphs [22, 23].\n\n3 Results\n\nIn this section, we apply our model to the primary visual cortex by modeling the receptive fields following the approach of [1]. We study the properties of the resulting sparse REFIRE network and compare them with experimentally established properties of cortical networks.\n\n3.1 Constructing the sparse REFIRE network for visual cortex\n\nWe learn the sparse REFIRE network from a standard set of natural images [8]. We extract patches of size 13×13 pixels. We use a set of 100,000 such patches distributed evenly across different natural images to learn the model. Whitening was performed through PCA, after the DC component of each patch was removed. The dimensionality was reduced from 169 to 84 dimensions. We learn a four times over-complete dictionary via the SPAMS online sparse approximation toolbox [24]. Figure 2 left shows the forward weights (columns of D) learned. As expected, the filters obtained are edge detectors differing in scale, spatial location and orientation.\nThe sparse REFIRE network was then learned from the dictionary using the same toolbox. The parameter λ in Equation (9) governs the tradeoff between sparsity and reconstruction fidelity (Figure 2 right). We verified that the results presented in this study do not qualitatively change over a wide range of λ and chose the value of λ for which the average probability of connection was 9%, in agreement with the experimental number of approximately 10%. For this choice the relative reconstruction mismatch was approximately 10^{−3}. The distribution of synaptic weights in the network (Figure 3 left) shows a strong bias toward zero-valued connections and a heavier-than-Gaussian tail, as do the cortical data [25].\nFor an enlarged view of the network see Figure 7. 
From here on we consider that particular choice when we refer to the sparse REFIRE network.\nRemarkably, the real part of every eigenvalue is less than or equal to one (Figure 3 right), indicating stability of the network dynamics. Although Equation (7) guarantees that n eigenvalues are equal to one, it does not rule out the existence of eigenvalues with greater real part. We speculate that the absence of such eigenvalues in the spectrum is due to the l1 term in Equation (9), the minimization of which could be viewed as a shrinkage of Gershgorin circles. We find that the connectivity learned was asymmetric, with substantial imaginary components in the eigenvalues (see Figure 3 right). In general, the sparse REFIRE network is unlikely to be symmetric because the connection weights between a pair of neurons are not decided based solely on the identity of the neurons in the pair but depend on other connections of the same pre-synaptic neuron.\n\nFigure 2: The sparse REFIRE network. Left: the patches corresponding to columns of D sorted by variance. Right: summed l1-norm of all columns of L (left y-axis, red) and the reconstruction mismatch |(D − DL)|/|D| (right y-axis, blue) as a function of λ. The dashed line indicates the value of λ chosen for the sparse REFIRE network.\n\nFigure 3: Properties of lateral connections. Left: distribution of lateral connectivity weights. Inset shows a survival plot with logarithmic y-axis and the same axis limits. Right: scatter plot of eigenvalues of the lateral connectivity matrix. Note that there are many eigenvalues at real value one, imaginary value zero. Histogram shown below the plot.\n\nNumerical simulations of the dynamics of a recurrent network with connectivity matrix L confirm that the percept remains stable during the network dynamics. We chose an image patch at random and simulated the network dynamics. 
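A toy analogue of this simulation, using a random dictionary and an L built from the general solution of Equation (8) with its diagonal zeroed (all sizes illustrative, not the learned network), shows the percept holding still while the activity moves:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 12                              # toy sizes, not the learned dictionary
D = rng.standard_normal((n, m))

P = D.T @ np.linalg.solve(D @ D.T, D)     # projector onto the row space of D
N = np.eye(m) - P                         # projector onto the null space of D

# Equation (8): L = P + L_perp, with the null-space columns of N scaled so
# that diag(L) = 0 (no autapses); D L_perp = 0 by construction.
L = P + N * (-np.diag(P) / np.diag(N))

a = rng.standard_normal(m)
a0 = a.copy()
s0 = D @ a                                # the initial percept

dt = 0.01
for _ in range(1000):                     # 10 membrane time constants
    a = a + dt * (-a + L @ a)             # Euler step of a' = -a + L a

print(np.allclose(D, D @ L))              # True: REFIRE condition holds
print(np.allclose(np.diag(L), 0.0))       # True: no autapses
print(np.allclose(D @ a, s0, atol=1e-6))  # True: percept unchanged
print(np.allclose(a, a0))                 # False: the activity moved
```

The percept is conserved by construction, since D(L − I) = 0 makes ṡ vanish identically; the particular L⊥ chosen here to cancel the diagonal is just one convenient member of the REFIRE family.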
As can be seen in Figure 4 left, despite significant changes in the activity of the neurons, the percept encoded by the network remained stable: the PSNR between the original image and the image after dynamics lasting 100 neuronal time constants was 45.5 dB. The dynamics of the network desparsified the representation (Figure 4 right). Averaged across multiple patches, the value of each coefficient in the sparse representation was 0.0704, while after the network dynamics this increased to 0.0752, though still below the value obtained for the frame-coefficient representation, which was 0.0814.\n\nFigure 4: Evolution of neuronal activity in time. Left: activity of a subset of neurons over time. Top shows the original percept (framed in black) and, plotted left to right, patches taken from consecutive points in the dynamics. Right: scatter of the coefficients before and after 400 neuronal time constants of the dynamics.\n\n3.2 Computational advantages of the sparse REFIRE network\n\nIn this section, we consider possible computational advantages of the decoupling between the sensory percept and its representation by neuronal activity. Specifically, we address a shortcoming of the sparse representation, its lack of robustness [13]: namely, the fact that stimuli that differ only to a small degree might end up being represented with very different coefficients. Intuitively speaking, this may occur when two (or more) dictionary elements compete for the same role in the sparse representation. 
To arrive at a sparse approximation of the stimulus, either one of the dictionary elements could potentially be used, but due to the high cost of non-sparseness the two are not likely to be chosen together in a given representation. Thus, small changes in the image, as might arise from various noise sources, might cause one of the coefficients to be preferred over the other in an essentially random fashion, potentially resulting in very different coefficient values for highly similar images.\nThe dynamics of the sparse REFIRE network improve the robustness of the coefficient values in the face of noise. In order to model this effect we extract a single patch and corrupt it repeatedly with i.i.d. 5% Gaussian noise. Figure 5 left shows two patches with similar orientation. Figure 5 middle shows the values of these two coefficients for the sparse approximation taken across the different noise repetitions. As can clearly be seen, only one or the other of the two coefficients is used; the resulting flickering in the coefficients exemplifies the competition described above and the lack of robustness it produces. Note that the true lack of robustness arises from multicollinear relations between the different dictionary elements; here we restrict ourselves to two elements in the interests of clarity. Figure 5 right shows these coefficient values plotted one against the other in red, along with the values of the two coefficients following the model dynamics in blue. In the latter case, the coefficient values remain fairly constant between repetitions, and the flickering representation of Figure 5 middle is abolished.\nWe further examined the utility of a more stable representation by training a Naive Bayes classifier to discriminate between noisy versions of two patches. 
We corrupt the two patches with i.i.d. noise and train the classifiers on 75% of the data while reserving the remaining data for testing generalization. We train one classifier on the sparse representation and the other on the representation following the dynamics of the sparse REFIRE network. We find that the generalization of the classifier trained on the post-dynamics representation was indeed higher, providing 92% accuracy, while the classifier trained on the sparse coefficients scored 83% accuracy.\nWe then demonstrate the computational advantages of the sparse REFIRE network in a more realistic scenario, encoding a set of patches extracted from an image by shifting the patch one pixel at a time. Such a shift can be caused by fixational drift or slow self-movement. Figure 5 right top shows a subset of the patches extracted in this fashion. For each of the patches we calculate the sparse approximation coefficients and then determine the dot product between the representations of consecutive patches. We then take the same coefficients, evolve them through the dynamics of the sparse REFIRE network and compute the dot product between these new coefficients. Figure 5 right bottom shows the normalized dot product: the value of the dot product between the coefficients of two consecutive patches after the sparse REFIRE network dynamics, divided by the same dot product between the original coefficients. As can be seen, in nearly all cases the ratio is higher than one, indicating a smoother transition between the coefficients of the consecutive patches.\n\nFigure 5: Sparse REFIRE network dynamics enhance the robustness of representation. Left: the patches corresponding to two columns of D with similar tuning. 
Middle: the coefficient of each patch in the representation of the different noisy image instantiations, and a scatter plot of the coefficient values before (red) and following (blue) the recurrent dynamics. Right: an example of the patches in the sliding frame (top) and the normalized dot product between consecutive patches (bottom).\n\nFigure 6: Dictionary clustering. Clusters of patches obtained by a three-way partitioning of the sparse REFIRE network by normalized cut. Note the mainly horizontal orientation of the first set of patches and the vertical orientation of the second.\n\nThe sparse REFIRE network encodes useful information regarding the relations between the different dictionary elements. This can be probed by partitioning performed on the graph [20]. Figure 6 shows the components of a normalized cut performed on the sparse REFIRE network. The left group shows a clear bias towards horizontal orientation tuning, the middle towards vertical. Thus, subspaces can be learned directly by partitioning the sparse REFIRE network, offering a complementary approach to learning structured models directly from the data [26, 27].\nFinally, the sparse REFIRE network serves as an integrator of the sensory input. The eigenspace of the unit eigenvalue is a multi-dimensional generalization of the line attractor used to model persistent activity [16]. However, unlike the persistent activity theory, which focuses on dynamics along the line attractor, we emphasize the transient dynamics approaching the unit eigenspace.\n\n4 Discussion\n\nThis study makes a number of novel contributions. First, we propose and demonstrate that in an over-complete representation certain types of network connectivity allow the percept, i.e. 
the stimulus represented by the network activity, to remain fixed in time despite changing neuronal activity. Second, we propose the sparse REFIRE network as a biologically plausible model for the cortical lateral connections that enable such persistent percepts. Third, we point out that the ability to manipulate activity without affecting the accuracy of the representation can be exploited in order to achieve computational goals. As an example, we show that the sparse REFIRE network dynamics, though causing the representation to be less sparse, alleviate the problem of representation non-robustness.\nAlthough this study focused on sensory representation in the visual cortex, the framework can be extended to other sensory modalities, motor cortex and, perhaps, even higher cognitive areas such as prefrontal cortex or hippocampus.\n\nFigure 7: Sparse REFIRE network structure. Nodes are shown as patches corresponding to their feature vectors. Arrows indicate connections: blue excitatory, red inhibitory. The plot is organized to place strongly connected nodes close in space; only the strongest connections are shown in the interests of clarity. Inset: left, histogram of connectivity fraction by difference in feature orientation (red: non-zero connections; gray: all connections); right, zoomed-in view.\n\nThe sparse REFIRE network model bears an important relation to the family of sparse subspace models, which have been suggested to improve the robustness of sparse representations [26, 27]. We have shown that subspaces can be learned directly from the graph by standard graph partitioning algorithms. 
The optimal way to leverage the information embodied in the sparse REFIRE network to learn subspace-like models is a subject of ongoing work with promising results, as is the study of different matrices L that allow persistent percepts.\n\nAcknowledgments\n\nWe would like to thank Anatoli Grinshpan, Tao Hu, Alexei Koulakov, Bruno Olshausen and Lav Varshney for fruitful discussions and Frank Midgley for assistance with preparing Figure 7.\n\nReferences\n\n[1] B. A. Olshausen and D. J. Field, “Emergence of simple-cell receptive field properties by learning a sparse code for natural images,” Nature, vol. 381, pp. 607–609, Jun 1996.\n\n[2] M. Rehn and F. Sommer, “A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields,” Journal of Computational Neuroscience, vol. 22, pp. 135–146, 2007. doi:10.1007/s10827-006-0003-9.\n\n[3] P. J. Garrigues and B. A. Olshausen, “Learning horizontal connections in a sparse coding model of natural images,” Advances in Neural Information Processing Systems, vol. 20, pp. 505–512, 2008.\n\n[4] O. Shriki, H. Sompolinsky, and D. D. Lee, “An information maximization approach to over-complete and recurrent representations,” Advances in Neural Information Processing Systems, vol. 12, pp. 87–93, 2000.\n\n[5] D. B. Chklovskii and A. A. Koulakov, “Maps in the brain: What can we learn from them?,” Annual Review of Neuroscience, vol. 27, no. 1, pp. 369–392, 2004.\n\n[6] G. Major and D. Tank, “Persistent neural activity: prevalence and mechanisms,” Current Opinion in Neurobiology, vol. 14, no. 6, pp. 675–684, 2004.\n\n[7] M. Goldman, “Memory without feedback in a neural network,” Neuron, vol. 61, no. 4, pp. 621–634, 2009.\n\n[8] A. Hyvärinen, J. Hurri, and P. O. 
Hoyer, Natural Image Statistics: A Probabilistic Approach to Early Computational Vision. Springer Publishing Company, Incorporated, 2009.\n\n[9] O. Christensen, An Introduction to Frames and Riesz Bases. Birkhäuser, 2003.\n\n[10] S. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE Transactions on Signal Processing, vol. 41, pp. 3397–3415, Dec 1993.\n\n[11] S. Chen, D. Donoho, and M. Saunders, “Atomic decomposition by basis pursuit,” SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.\n\n[12] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society (Series B), vol. 58, pp. 267–288, 1996.\n\n[13] C. J. Rozell, D. H. Johnson, R. G. Baraniuk, and B. A. Olshausen, “Sparse coding via thresholding and local competition in neural circuits,” Neural Computation, vol. 20, pp. 2526–2563, 2008.\n\n[14] V. Braitenberg and A. Schüz, Cortex: Statistics and Geometry of Neuronal Connectivity. Berlin, Germany: Springer, 1998. ISBN 3-540-63816-4.\n\n[15] S. Cannon, D. Robinson, and S. Shamma, “A proposed neural network for the integrator of the oculomotor system,” Biological Cybernetics, vol. 49, no. 2, pp. 127–136, 1983.\n\n[16] H. Seung, “How the brain keeps the eyes still,” Proceedings of the National Academy of Sciences, vol. 93, no. 23, p. 13339, 1996.\n\n[17] J. Kovačević and A. Chebira, “An introduction to frames,” Foundations and Trends in Signal Processing, vol. 2, no. 1, pp. 1–94, 2008.\n\n[18] Y. Mishchenko, T. Hu, J. Spacek, J. Mendenhall, K. M. Harris, and D. B. Chklovskii, “Ultrastructural analysis of hippocampal neuropil from the connectomics perspective,” Neuron, vol. 67, no. 6, pp. 1009–1020, 2010.\n\n[19] L. R. Varshney, P. J. Sjöström, and D. B. 
Chklovskii, “Optimal information storage in noisy synapses under resource constraints,” Neuron, vol. 52, no. 3, pp. 409–423, 2006.\n\n[20] B. Cheng, J. Yang, S. Yan, Y. Fu, and T. Huang, “Learning with l1-graph for image analysis,” IEEE Transactions on Image Processing, p. 1, 2010.\n\n[21] E. Elhamifar and R. Vidal, “Sparse subspace clustering,” in CVPR, pp. 2790–2797, 2009.\n\n[22] R. Jenatton, J. Mairal, G. Obozinski, and F. Bach, “Proximal methods for sparse hierarchical dictionary learning,” Proc. ICML, 2010.\n\n[23] D. Zoran and Y. Weiss, “The ‘tree-dependent components’ of natural images are edge filters,” Advances in Neural Information Processing Systems, 2009.\n\n[24] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, “Online learning for matrix factorization and sparse coding,” Journal of Machine Learning Research, vol. 11, pp. 19–60, 2010.\n\n[25] S. Song, P. J. Sjöström, M. Reigl, S. Nelson, and D. B. Chklovskii, “Highly nonrandom features of synaptic connectivity in local cortical circuits,” PLoS Biology, vol. 3, p. e68, Mar 2005.\n\n[26] G. Yu, G. Sapiro, and S. Mallat, “Image modeling and enhancement via structured sparse model selection,” 2010.\n\n[27] K. Kavukcuoglu, M. Ranzato, R. Fergus, and Y. LeCun, “Learning invariant features through topographic filter maps,” in Proc. International Conference on Computer Vision and Pattern Recognition (CVPR’09), IEEE, 2009.", "award": [], "sourceid": 270, "authors": [{"given_name": "Shaul", "family_name": "Druckmann", "institution": null}, {"given_name": "Dmitri", "family_name": "Chklovskii", "institution": null}]}