{"title": "An Information Theoretic Approach to the Functional Classification of Neurons", "book": "Advances in Neural Information Processing Systems", "page_first": 213, "page_last": 220, "abstract": null, "full_text": "An Information Theoretic Approach to the\n\nFunctional Classi\ufb01cation of Neurons\n\nElad Schneidman,1;2 William Bialek,1 and Michael J. Berry II2\n\n1Department of Physics and 2Department of Molecular Biology\n\nPrinceton University, Princeton NJ 08544, USA\n\nfelads,wbialek,berryg@princeton.edu\n\nAbstract\n\nA population of neurons typically exhibits a broad diversity of responses\nto sensory inputs. The intuitive notion of functional classi\ufb01cation is that\ncells can be clustered so that most of the diversity is captured by the iden-\ntity of the clusters rather than by individuals within clusters. We show\nhow this intuition can be made precise using information theory, with-\nout any need to introduce a metric on the space of stimuli or responses.\nApplied to the retinal ganglion cells of the salamander, this approach re-\ncovers classical results, but also provides clear evidence for subclasses\nbeyond those identi\ufb01ed previously. Further, we \ufb01nd that each of the gan-\nglion cells is functionally unique, and that even within the same subclass\nonly a few spikes are needed to reliably distinguish between cells.\n\n1\n\nIntroduction\n\nNeurons exhibit an enormous variety of shapes and molecular compositions. Already in his\nclassical work, Cajal [1] recognized that the shapes of cells can be classi\ufb01ed, and he iden-\nti\ufb01ed many of the cell types that we recognize today. Such classi\ufb01cation is fundamentally\nimportant, because it implies that instead of having to describe (cid:24)1012 individual neurons,\na mature neuroscience might need to deal only with a few thousand different classes of\nnominally identical neurons. There are three broad methods of classi\ufb01cation: morpholog-\nical, molecular, and functional. 
Morphological and molecular classification are appealing because they deal with relatively fixed properties, but ultimately the functional properties of neurons are the most important, and neurons that share the same morphology or molecular markers need not embody the same function. With attention to arbitrary detail, every neuron will be individual, while a coarser view might overlook an important distinction; a quantitative formulation of the classification problem is essential.\n\nThe vertebrate retina is an attractive example: its anatomy is well studied and highly ordered, containing repeated micro-circuits that look out at different angles in visual space [1, 2, 3]; its overall function (vision) is clear, giving the experimenter better intuition about relevant stimuli; and responses of many of its output neurons, ganglion cells, can be recorded simultaneously using a multi-electrode array, allowing greater control of experimental variables than is possible with serial recordings [4]. Here we exploit this favorable experimental situation to highlight the mathematical questions that must lie behind any attempt at classification.\n\nFunctional classification of retinal ganglion cells typically has consisted of finding qualitatively different responses to simple stimuli. Classes are defined by whether ganglion cells fire spikes at the onset or offset of a step of light, or both (ON, OFF, and ON/OFF cells in the frog [5]), or whether they fire once or twice per cycle of a drifting grating (X and Y cells in the cat [6]). Further elaborations exist. In the frog, the literature reports 1 class of ON-type ganglion cell and 4 or 5 classes of OFF-type [7]. The salamander has been reported to have only 3 of these OFF-type ganglion cells [8]. The classes have been distinguished using stimuli such as diffuse flashes of light, moving bars, and moving spots. 
The results are similar to earlier work using more exotic stimuli [9]. In some cases, there is very close agreement between anatomical and functional classes, such as the (α, β) and (Y, X) cells in the cat. However, the link between anatomy and function is not always so clear.\n\nHere we show how information theory allows us to define the problem of classification without any a priori assumptions regarding which features of the visual stimulus or neural response are most significant, and without imposing a metric on these variables. All notions of similarity emerge from the joint statistics of neurons in a population as they respond to common stimuli. To the extent that we identify the function of retinal ganglion cells as providing the brain with information about the visual world, our approach finds exactly the classification which captures this functionality in a maximally efficient manner. Applied to experiments on the tiger salamander retina, this method identifies the major types of ganglion cells in agreement with traditional methods, but on a finer level we find clear structure within a group of 19 fast OFF cells that suggests at least 5 functional subclasses. More profoundly, even cells within a subclass are very different from one another, so that on average the ganglion cell responses to the simplified visual stimuli we have used provide ~6 bits/s of information about cell identity within our population of 21 cells. This is sufficient to identify uniquely each neuron in an “elementary patch” of the retina within one second, and a typical pair of cells can be distinguished reliably by observing an average of just two or three spikes.\n\n2 Theory\n\nSuppose that we could give a complete characterization, for each neuron i = 1, 2, ..., N in a population, of the probability P(r|s, i) that a stimulus s will generate the response r. 
Traditional approaches to functional classification introduce (implicitly or explicitly) a parametric representation for the distributions P(r|s, i) and then search for clusters in this parameter space. For visual neurons we might assume that responses are determined by the projection of the stimulus movie s onto a single template or receptive field f_i, P(r|s, i) = F(r; f_i · s); classifying neurons then amounts to clustering the receptive fields. But it is not possible to cluster without specifying what it means for these vectors to be similar; in this case, since the vectors come from the space of stimuli, we need a metric or distortion measure on the stimuli themselves. It seems strange that classifying the responses of visual neurons requires us to say in advance what it means for images or movies to be similar.1\n\n1 If all cells are selective for a small number of commensurate features, then the set of vectors f_i must lie on a low-dimensional manifold, and we can use this selectivity to guide the clustering. But we still face the problem of defining similarity: even if all the receptive fields in the retina can be summarized meaningfully by the diameters of the center and surround (for example), why should we believe that Euclidean distance in this two-dimensional space is a sensible metric?\n\nInformation theory suggests a formulation that does not require us to measure similarity among either stimuli or responses. Imagine that we present a stimulus s and record the response r from a single neuron in the population, but we don't know which one. This response tells us something about the identity of the cell, and on average this can be quantified as the mutual information between responses and identity (conditional on the stimulus),\n\nI(r; i|s) = (1/N) Σ_{i=1}^{N} Σ_r P(r|s, i) log2[ P(r|s, i) / P(r|s) ]  bits,   (1)\n\nwhere P(r|s) = (1/N) Σ_{i=1}^{N} P(r|s, i). 
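As a concrete illustration, the information that a response carries about cell identity can be estimated directly from the conditional response distributions. This is a minimal sketch, assuming discrete response alphabets, equiprobable stimulus frames, and a uniform prior over the N cells; the function name and array layout are ours, not the paper's:

```python
import numpy as np

def identity_information(p_r_given_s_i):
    """<I(r; i|s)>_s in bits (cf. Eqs. 1-2), assuming discrete responses,
    equiprobable stimuli, and a uniform prior over the N cells.

    p_r_given_s_i: array of shape (S, N, R); entry [s, i, r] is P(r|s, i).
    """
    p = np.asarray(p_r_given_s_i, dtype=float)
    S, N, _ = p.shape
    info = 0.0
    for s in range(S):
        p_r = p[s].mean(axis=0)               # P(r|s) = (1/N) sum_i P(r|s,i)
        for i in range(N):
            nz = p[s, i] > 0                  # skip zero-probability responses
            info += np.sum(p[s, i, nz] * np.log2(p[s, i, nz] / p_r[nz]))
    return info / (S * N)
```

Two cells with identical response distributions contribute nothing, while two cells with fully disjoint deterministic responses yield the full 1 bit needed to tell them apart.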
The mutual information I(r; i|s) measures the extent to which different cells in the population produce reliably distinguishable responses to the same stimulus; from Shannon's classical arguments [10] this is the unique measure of these correlations which is consistent with simple and plausible constraints. It is natural to ask this question on average in an ensemble of stimuli P(s) (ideally the natural ensemble),\n\n⟨I(r; i|s)⟩_s = (1/N) Σ_{i=1}^{N} ∫ [ds] P(s) Σ_r P(r|s, i) log2[ P(r|s, i) / P(r|s) ],   (2)\n\nwhich is invariant under all invertible transformations of r or s.\n\nBecause information is mutual, we also can think of ⟨I(r; i|s)⟩_s as the information that cellular identity provides about the responses we will record. But now it is clear what we mean by classifying the cells: if there are clear classes, then we can predict the responses to a stimulus just by knowing the class to which a neuron belongs rather than knowing its unique identity. Thus we should be able to find a mapping i → C of cells into classes C = 1, 2, ..., K such that ⟨I(r; C|s)⟩_s is almost as large as ⟨I(r; i|s)⟩_s, despite the fact that the number of classes K is much less than the number of cells N.\n\nOptimal classifications are those which use the K different class labels to capture as much information as possible about the stimulus-response relation, maximizing ⟨I(r; C|s)⟩_s at fixed K. More generally we can consider soft classifications, described by probabilities P(C|i) of assigning each cell to a class, in which case we would like to capture as much information as possible about the stimulus-response relation while constraining the amount of information that class labels provide directly about identity, I(C; i). 
In this case our optimization problem becomes, with λ as a Lagrange multiplier,\n\nmax_{P(C|i)} [ ⟨I(r; C|s)⟩_s - λ I(C; i) ].   (3)\n\nThis is a generalization of the information bottleneck problem [11].\n\nHere we confine ourselves to hard classifications, and use a greedy agglomerative algorithm [12] which starts with K = N and makes mergers which at every step provide the smallest reduction in I(r; C|s). This information loss on merging cells (or clusters) i and j is given by\n\nD(i, j) ≡ ΔI_ij(r; C|s) = ⟨ D_JS[ P(r|s, i) || P(r|s, j) ] ⟩_s,   (4)\n\nwhere D_JS is the Jensen-Shannon divergence [13] between the two distributions, or equivalently the information that one sample provides about its source distribution in the case of just these two alternatives. The matrix of “distances” ΔI_ij characterizes the similarities among neurons in pairwise fashion.\n\nFinally, if cells belong to clear classes, then we ought to be able to replace each cell by a typical or average member of the class without sacrificing function. In this case function is quantified by asking how much information cells provide about the visual scene. There is a strict complementarity of the information measures: information that the stimulus/response relation provides about the identity of the cell is exactly information about the visual scene which will be lost if we don't know the identity of the cells [14]. 
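The Jensen-Shannon divergence that sets the merge cost in Eq. (4) is short enough to state as code. A sketch, assuming equal mixture weights for the two cells (as when merging two single cells) and discrete response distributions; names are illustrative:

```python
import numpy as np

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (bits) between two discrete response
    distributions, with equal weights 1/2. Averaged over stimuli, this is
    the information loss D(i, j) of Eq. (4) for merging cells i and j."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)                     # the pooled (merged) distribution

    def kl(a, b):                         # Kullback-Leibler divergence in bits
        nz = a > 0
        return float(np.sum(a[nz] * np.log2(a[nz] / b[nz])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical distributions give 0 bits; fully disjoint responses give 1 bit, exactly the amount needed to reliably distinguish one pair of cells from a single sample.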
Our information theoretic approach to classification of neurons thus produces classes such that replacing cells with average class members provides the smallest loss of information about the sensory inputs.\n\n3 The responses of retinal ganglion cells to identical stimuli\n\nWe recorded simultaneously 21 retinal ganglion cells from the salamander using a multi-electrode array.2 The visual stimulus consisted of 100 repeats of a 20 s segment of spatially uniform flicker (see Fig. 1a), in which light intensity values were randomly selected every 30 ms from a Gaussian distribution having a mean of 4 mW/mm^2 and an RMS contrast of 18%. Thus, the photoreceptors were presented with exactly the same visual stimulus, and the movie is many correlation times in duration, so we can replace averages over stimuli by averages over time (ergodicity). A 3 s sample of the ganglion cells' responses to the visual stimulus is shown in Fig. 1b. There are times when many of the cells fire together, while at other times only a subset of these cells is active. Importantly, the same neuron may be part of different active groups at different times.\n\nFigure 1: Responses of salamander ganglion cells to modulated uniform field intensity. a: The retina is presented with a series of uniform intensity “images”. The intensity modulation is Gaussian white noise distributed. b: A 3 sec segment of the (concurrent) responses of 21 ganglion cells to repeated presentation of the stimulus. 
The rasters are ordered from bottom to top according to the average firing rate of the neurons (over the whole movie). c: Firing rates and information rates of the different cells as a function of their rank, ordered by firing rate. d: The average stimulus pattern preceding a spike for each of the different cells. Traditionally, these would be classified as 1 ON cell, 1 slow-OFF cell, and 19 fast-OFF cells.\n\n2 The retina is isolated from the eye of the larval tiger salamander (Ambystoma tigrinum) and perfused in Ringer's medium. Action potentials were measured extracellularly using a multi-electrode array [4], while light was projected from a computer monitor onto the photoreceptor layer. Because erroneously sorted spikes would strongly affect our results, we were very conservative in our identification of cleanly isolated cells.\n\nOn a finer time scale than shown here, the latency of the responses of the single neurons and their spiking patterns differ across time. To analyze the responses of the different neurons, we discretize the spike trains into time bins of size Δt. We examine the response in windows of time having length T, so that an individual neural response r becomes a binary 'word' W with T/Δt 'letters'.3\n\nSince the cells in Fig. 1b are ordered according to their average firing rate, it is clear that there is no 'simple' grouping of the cells' responses with respect to this response parameter; firing rates range continuously from 1 to 7 spikes per second (Fig. 1c). Similarly, the rate of information (estimated according to [15]) that the cells encode about the same stimulus also ranges continuously from 3 to 20 bits/s. We estimate the average stimulus pattern preceding a spike for each of the cells, the spike-triggered average (STA), shown in Fig. 
1d. According to traditional classification based on the STA, one of the cells is an ON cell, one is a slow OFF cell, and 19 belong to the fast OFF class [16]. While it may be possible to separate the 19 waveforms of the fast OFF cells into subgroups, this requires assumptions about what stimulus features are important. Furthermore, there is no clear standard for ending such subclassification.\n\n4 Clustering of the ganglion cells' responses into functional types\n\nTo classify these ganglion cells, we solved the information theoretic optimization problem described above. Figure 2a shows the pairwise distances D(i, j) among the 21 cells, ordered by their average firing rates; again, firing rate alone does not cluster the cells. The result of the greedy clustering of the cells is shown by a binary dendrogram in Fig. 2b.\n\nFigure 2: Clustering ganglion cell responses. a: Average distances between the cells' responses; cells are ordered by their average firing rate. b: Dendrogram of cell clustering. Cell names correspond to their firing rate rank. The height of a merge reflects the distance between merged elements. c: The information that the cells' responses convey about the clusters at every stage of the clustering in (b), normalized to the total information that the responses convey about cell identity. 
Using different response segment parameters or a different clustering method (e.g., nearest neighbor) results in very similar behavior. d: Reordering of the distance matrix in (a) according to the tree structure given in (b).\n\n3 As any fixed choice of T and Δt is arbitrary, we explore a range of these parameters.\n\nThe greedy agglomerative approximation [12] starts from every cell as a single cluster. We iteratively merge the clusters c_i and c_j which have the minimal value of D(c_i, c_j), and display this distance, or information loss, as the height of the merger in Fig. 2b. We pool their spike trains together as the responses of the new cell class. We then re-estimate the distances between clusters and repeat the procedure, until we get a single cluster that contains all cells. Fig. 2c shows the compression in information achieved by each of the mergers: for each number of clusters, we plot the mutual information between the clusters and the responses, ⟨I(r; C|s)⟩_s, normalized by the information that the response conveys about the full set of cells, ⟨I(r; i|s)⟩_s. The clustering structure and the information curve in Fig. 2c are robust (up to one cell difference in the final dendrogram) to changes in the word size and bin size used; we even obtain the same results with a nearest neighbor clustering based on D(i, j). This suggests that the top 7 mergers in Fig. 2b (which correspond to the bottom 7 points in panel c) are of significantly different subgroups. Two of these mergers, which correspond to the rightmost branches of the dendrogram, separate out the ON and slow OFF cells. The remaining 5 clusters are subclasses of fast OFF cells. However, Fig. 
2d, which shows the dissimilarity matrix from panel (a) reordered by the result of the clustering, demonstrates that while there is clear structure within the cell population, the subclasses are not sharply distinct.\n\nHow many types are there?\n\nWhile one might be happy with classifying the fast OFF cells into 5 subclasses, we further asked whether the cells within a subclass are reliably distinguishable from one another; that is, are the bottom mergers in Fig. 2b-c significant? To this end we randomly split each of the 21 cells into 2 halves (of 50 repeats each), or 'siblings', and re-clustered. Figure 3a shows the resulting dendrogram of this clustering, indicating that the cells are reliably distinguishable from one another: the nearest neighbor of each new half-cell is its own sibling, and (almost) all of the first-layer mergers are of the corresponding siblings (the only mismatch is of a sibling merging with a neighboring full cell and then with the other sibling). Figure 3b shows the very different cumulative probability distributions of pairwise distances among the parent cells and of the distances between siblings.\n\nFigure 3: Every cell is different from the others. a: Clustering of cell responses after randomly splitting every cell into 2 “siblings”. The nearest neighbor of each of the new cells is its sibling and (except for one case) so is the first merge. From the second level upwards, the tree is identical to Fig. 2b (up to symmetry of tree plotting). 
b: Cumulative distribution of pairwise distances between cells. The distances between siblings are easily discriminated from the continuous distribution of values for all the (real) cells.\n\nHow significant are the differences between the cells?\n\nIt might be that cells are distinguishable, but only after observing their responses for very long times. Since 1 bit is needed to reliably distinguish between a pair of cells, Fig. 3b shows that more than 90% of the pairs are reliably distinguishable within 2 seconds or less. This result is especially striking given the low mean spike rate of these cells; clearly, at times when none of the cells is spiking, it is impossible to distinguish between them. To place the information about identity on an absolute scale, we compare it to the entropy of the responses at each time, using 10 ms segments of the responses at each time during the stimulus (Fig. 4a). Most of the points lie close to the origin, but many of them reflect discrete times when the responses of the neurons are very different and hence highly informative about cell identity: under the conditions of our experiment, roughly 30% of the response variability among cells is informative about their identity.4 On average, observing a single neural response gives about 6 bits/s about the identity of the cells within this population. We also computed the average number of spikes per cell which we need to observe to distinguish reliably between cells i and j,\n\nn_d(i, j) = (1/2)(r̄_i + r̄_j) / D(i, j),   (5)\n\nwhere r̄_i is the average spike rate of cell i in the experiment. Figure 4b shows the cumulative probability distribution of the values of n_d. 
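Equation (5) is simple enough to state as code. A sketch (the function name is ours), taking the mean firing rates in spikes/s and the distance D(i, j) in bits/s, so that n_d comes out in spikes, using the criterion that 1 bit reliably separates a pair of cells:

```python
def spikes_to_distinguish(rate_i, rate_j, d_ij):
    """Average number of spikes needed to reliably distinguish cells i and j:
    n_d(i, j) = (rate_i + rate_j) / (2 * D(i, j))  [Eq. (5)].
    rate_i, rate_j: mean firing rates (spikes/s); d_ij: D(i, j) (bits/s)."""
    return 0.5 * (rate_i + rate_j) / d_ij
```

For example, a pair firing at 2 and 4 spikes/s whose responses differ by 1 bit/s is distinguishable after about 3 spikes on average.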
Evidently, more than 80% of the pairs are reliably distinguishable after observing, on average, only 3 spikes from one of the neurons. Since ganglion cells fire in bursts, this suggests that most cells are reliably distinguishable based on a single firing 'event'! We also show that for the 11 most similar cells (those in the left subtree in Fig. 2b) only a few more spikes, or one extra firing event, are required to reliably distinguish them.\n\nFigure 4: High diversity among cells. a: The average information that a response segment conveys about the identity of the cell as a function of the entropy of the responses. Every point stands for a time point along the stimulus. Results shown are for 2-letter words of 5 ms bins; similar behavior is observed for different word sizes and bins. b: Cumulative distribution of the average number of spikes that are needed to distinguish between pairs of cells.\n\n5 Discussion\n\nWe have identified a diversity of functional types of retinal ganglion cells by clustering them to preserve information about their identity. Beyond the easy classification of the major types of salamander ganglion cells – fast OFF, slow OFF, and ON – in agreement with traditional methods, we have found clear structure within the fast OFF cells that suggests at least 5 more functional classes. 
Furthermore, we found evidence that each cell is functionally unique. Even under this relatively simple stimulus, the analysis revealed that the cell responses convey ~6 bits/s of information about cell identity within this population of 21 cells. Ganglion cells in the salamander interact with each other and collect information from a ~250 μm radius; given the density of ganglion cells, the observed rate implies that a single ganglion cell can be discriminated from all the cells in this “elementary patch” within 1 s. This is a surprising degree of diversity, given that 19 cells in our sample would be traditionally viewed as nominally the same.\n\n4 Since the cells receive the same stimulus and often possess shared circuitry, an efficiency as high as 100% is very unlikely.\n\nOne might wonder if our choice of uniform flicker limits the results of our classification. However, we found that this stimulus was rich enough to distinguish every ganglion cell in our data set. It is likely that stimuli with spatial structure would reveal further differences. Using a larger collection of cells will enable us to explore the possibility that there is a continuum of unique functional units in the retina.\n\nHow might the brain make use of this diversity? Several alternatives are conceivable. By comparing the spiking of closely related cells, it might be possible to achieve much finer discrimination among stimuli that tend to activate both cells. Diversity also can improve the robustness of retinal signalling: as the retina is constantly setting its adaptive state in response to statistics of the environment that it cannot estimate without some noise, maintaining functional diversity can guard against adaptation that overshoots its optimum. 
Finally, great functional diversity opens up additional possibilities for learning strategies, in which downstream neurons select the most useful of their inputs rather than merely summing over identical inputs to reduce their noise. The example of the invertebrate retina demonstrates that nature can construct neural circuits with almost crystalline reproducibility from synapse to synapse. This suggests that the extreme diversity found here in the vertebrate retina may not be the result of some inevitable sloppiness of neural development but rather an evolutionary selection of a different strategy for representing the visual world.\n\nReferences\n\n[1] Cajal, S.R., Histologie du systeme nerveux de l'homme et des vertebres, Paris: Maloine (1911).\n[2] Dowling, J., The Retina: An Approachable Part of the Brain, Cambridge, MA: Belknap Press (1987).\n[3] Masland, R.H., Nat. Neurosci., 4: 877-886 (2001).\n[4] Meister, M., Pine, J. & Baylor, D.A., J. Neurosci. Methods, 51: 95-106 (1994).\n[5] Hartline, H.K., Am. J. Physiol., 121: 400-415 (1937).\n[6] Hochstein, S. & Shapley, R.M., J. Physiol., 262: 265-284 (1976).\n[7] Grüsser, O.-J. & Grüsser-Cornehls, U., in Frog Neurobiology, eds: R. Llinas & W. Precht, 297-385, Springer-Verlag: New York (1976).\n[8] Grüsser-Cornehls, U. & Himstedt, W., Brain Behav. Evol., 7: 145-168 (1973).\n[9] Lettvin, J.Y., Maturana, H.R., McCulloch, W.S. & Pitts, W.H., Proc. I.R.E., 47: 1940-1951 (1959).\n[10] Shannon, C.E. & Weaver, W., The Mathematical Theory of Communication, Univ. of Illinois Press (1949).\n[11] Tishby, N., Pereira, F. & Bialek, W., in Proceedings of the 37th Allerton Conference on Communication, Control & Computing, Univ. of Illinois (1999); see also arXiv: physics/0004057.\n[12] Slonim, N. & Tishby, N., NIPS 12, 617-623 (2000).\n[13] Lin, J., IEEE Trans. Inform. Theory, 37, 145-151 (1991).\n[14] Schneidman, E., Brenner, N., Tishby, N., de Ruyter van Steveninck, R. & Bialek, W., NIPS 13: 159-165 (2001); 
see also arXiv: physics/0005043.\n[15] Strong, S.P., Koberle, R., de Ruyter van Steveninck, R. & Bialek, W., Phys. Rev. Lett., 80, 197-200 (1998); see also arXiv: cond-mat/9603127.\n[16] Keat, J., Reinagel, P., Reid, R.C. & Meister, M., Neuron, 30, 803-817 (2001).\n", "award": [], "sourceid": 2231, "authors": [{"given_name": "Elad", "family_name": "Schneidman", "institution": null}, {"given_name": "William", "family_name": "Bialek", "institution": null}, {"given_name": "Michael", "family_name": "Berry II", "institution": null}]}