    f^{(k)}(\mathbf{x}) = F \, \phi\bigl(\xi^{(k)2}\bigr) ,    (2)

where \mathbf{c}^{(k)} = (c_1^{(k)}, \ldots, c_D^{(k)}) is the center of the tuning curve of neuron k, \sigma_i^{(k)} is its tuning width in the i-th dimension, \xi_i^{(k)2} := (x_i - c_i^{(k)})^2 / \sigma_i^{(k)2} for i = 1, \ldots, D, and \xi^{(k)2} := \xi_1^{(k)2} + \cdots + \xi_D^{(k)2}. F > 0 denotes the maximal firing rate of the neurons, which requires that \max_{z \ge 0} \phi(z) = 1.

We assume that the tuning widths \sigma_1^{(k)}, \ldots, \sigma_D^{(k)} of each neuron k are drawn from a distribution P_\sigma(\sigma_1, \ldots, \sigma_D). For a population of tuning functions with centers \mathbf{c}^{(1)}, \ldots, \mathbf{c}^{(N)}, a density \eta(\mathbf{x}) is introduced according to \eta(\mathbf{x}) := \sum_{k=1}^{N} \delta(\mathbf{x} - \mathbf{c}^{(k)}).

The encoding accuracy can be quantified by the Fisher information matrix, J, which is defined as

    J_{ij}(\mathbf{x}) = E\left[ -\frac{\partial^2}{\partial x_i \, \partial x_j} \ln P(\mathbf{n}; \mathbf{x}) \right] ,    (3)

where E[\ldots] denotes the expectation value over the probability distribution P(\mathbf{n}; \mathbf{x}) [2]. The Fisher information yields a lower bound on the expected error of an unbiased estimator that retrieves the stimulus \mathbf{x} from the noisy neural activity (Cramér-Rao inequality) [2]. The minimal estimation error for the i-th feature x_i, \epsilon_{i,\min}, is given by \epsilon_{i,\min}^2 = (J^{-1})_{ii}, which reduces to \epsilon_{i,\min}^2 = 1 / J_{ii}(\mathbf{x}) if J is diagonal.

We shall now derive a general expression for the population Fisher information. In the next section, several cases and their consequences for neural encoding strategies will be discussed.

For model neuron (k), the Fisher information (3) reduces to

    J_{ij}^{(k)}(\mathbf{x}; \sigma_1^{(k)}, \ldots, \sigma_D^{(k)}) = \frac{1}{\sigma_i^{(k)} \sigma_j^{(k)}} \, A_\phi\bigl(\xi^{(k)2}, F, \tau\bigr) \, \xi_i^{(k)} \xi_j^{(k)} ,    (4)

Neural Representation of Multi-Dimensional Stimuli    117

where the dependence on the tuning widths is indicated by the list of arguments. The function A_\phi depends on the shape of the tuning function and is given in [13]. The independence assumption (1) implies that the population Fisher information is the sum of the contributions of the individual neurons, \sum_{k=1}^{N} J_{ij}^{(k)}(\mathbf{x}; \sigma_1^{(k)}, \ldots, \sigma_D^{(k)}). We now define a population Fisher information which is averaged over the distribution of tuning widths P_\sigma(\sigma_1, \ldots, \sigma_D):

    \langle J_{ij}(\mathbf{x}) \rangle_\sigma = \sum_{k=1}^{N} \int d\sigma_1 \cdots d\sigma_D \, P_\sigma(\sigma_1, \ldots, \sigma_D) \, J_{ij}^{(k)}(\mathbf{x}; \sigma_1, \ldots, \sigma_D) .    (5)

Introducing the density of tuning curves, \eta(\mathbf{x}), into (5) and assuming a constant distribution, \eta(\mathbf{x}) \equiv \eta = const., one obtains the result that the population Fisher information becomes independent of \mathbf{x} and that the off-diagonal elements of J vanish [13]. The average population Fisher information then becomes

    \langle J_{ij} \rangle_\sigma = \eta D K_\phi(F, \tau, D) \left\langle \frac{\prod_{l=1}^{D} \sigma_l}{\sigma_i \sigma_j} \right\rangle_{\sigma} \delta_{ij} ,    (6)

where K_\phi depends on the geometry of the tuning curves and is defined in [13].

3 Results

In this section, we consider different distributions of tuning widths in (6) and discuss advantageous and disadvantageous strategies for obtaining a high representational accuracy in the neural population.

Radially symmetric tuning curves. For radially symmetric tuning curves of width \bar{\sigma}, the tuning-width distribution reads

    P_\sigma(\sigma_1, \ldots, \sigma_D) = \prod_{i=1}^{D} \delta(\sigma_i - \bar{\sigma}) ;

see Fig. 1a for a schematic visualization of the arrangement of the tuning widths for the case D = 2. The average population Fisher information (6) for i = j becomes

    \langle J_{ii} \rangle_\sigma = \eta D K_\phi(F, \tau, D) \, \bar{\sigma}^{D-2} ,    (7)

a result already obtained by Zhang and Sejnowski [13]. Equation (7) basically shows that the minimal estimation error increases with \bar{\sigma} for D = 1, that it does not depend on \bar{\sigma} for D = 2, and that it decreases as \bar{\sigma} increases for D \ge 3. We shall discuss the relevance of this case below.

Identical tuning curves without radial symmetry. 
Next we discuss tuning curves which are identical but not radially symmetric; the tuning-width distribution for this case is

    P_\sigma(\sigma_1, \ldots, \sigma_D) = \prod_{i=1}^{D} \delta(\sigma_i - \bar{\sigma}_i) ,

where \bar{\sigma}_i denotes the fixed width in dimension i. For i = j, the average population Fisher information (6) reduces to [11, 4]

    \langle J_{ii} \rangle_\sigma = \eta D K_\phi(F, \tau, D) \, \frac{\prod_{l=1}^{D} \bar{\sigma}_l}{\bar{\sigma}_i^2} .    (8)

118    C. W. Eurich, S. D. Wilke and H. Schwegler

Figure 1: Visualization of different distributions of tuning widths for D = 2. (a) Radially symmetric tuning curves. The dot indicates a fixed \bar{\sigma}, while the diagonal line symbolizes a variation in \bar{\sigma} discussed in [13]. (b) Identical tuning curves which are not radially symmetric. (c) Tuning widths uniformly distributed within a small rectangle. (d) Two subpopulations each of which is narrowly tuned in one dimension and broadly tuned in the other direction.

Equation (8) contains (7) as a special case. From (8) it becomes immediately clear that the expected minimal square encoding error for the i-th stimulus feature, \epsilon_{i,\min}^2 = 1 / \langle J_{ii}(\mathbf{x}) \rangle_\sigma, depends on i, i.e., the population specializes in certain features. The error obtained in dimension i thereby depends on the tuning widths in all dimensions.

Which encoding strategy is optimal for a population whose task it is to encode a single feature, say feature i, with high accuracy while not caring about the other dimensions? In order to answer this question, we re-write (8) in terms of receptive field overlap.

For the tuning functions f^{(k)}(\mathbf{x}) encountered empirically, large values of the single-neuron Fisher information (4) are typically restricted to a region around the center of the tuning function, \mathbf{c}^{(k)}. 
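This localization can be made concrete with a minimal numerical sketch. The block below assumes a one-dimensional Gaussian tuning curve and Poisson spike-count noise, so that the single-neuron Fisher information takes the form of Eq. (4) with A_\phi(\xi^2, F, \tau) = \tau F e^{-\xi^2/2}; the Gaussian/Poisson choice and all parameter values are illustrative assumptions, since the paper keeps \phi and A_\phi general.

```python
import numpy as np

# Assumed model: f(x) = F * exp(-xi^2 / 2) with xi = (x - c) / sigma, and
# spike counts n ~ Poisson(tau * f(x)). For Poisson noise the Fisher
# information is J(x) = tau * f'(x)^2 / f(x)
#              = (tau * F / sigma**2) * xi**2 * exp(-xi**2 / 2).
F, tau, c, sigma = 100.0, 1.0, 0.0, 0.5   # illustrative parameter values

def fisher_info(x):
    xi = (x - c) / sigma
    return (tau * F / sigma**2) * xi**2 * np.exp(-xi**2 / 2)

# Riemann sum over a range wide enough to capture essentially all of J(x).
x = np.linspace(c - 12 * sigma, c + 12 * sigma, 240001)
dx = x[1] - x[0]
J = fisher_info(x)
total = J.sum() * dx
near = J[np.abs(x - c) <= 3 * sigma].sum() * dx
print(f"fraction of Fisher information within 3 tuning widths: {near / total:.3f}")
```

Under these assumptions, well over ninety percent of the total Fisher information lies within three tuning widths of the center, which motivates quantifying the captured fraction as a function of the size of the region.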
The fraction p(\beta) of the Fisher information that falls into a region \sqrt{\xi^{(k)2}} \le \beta around \mathbf{c}^{(k)} is given by

    p(\beta) := \frac{\int_{\xi \le \beta} d^D x \, \sum_{i=1}^{D} J_{ii}^{(k)}(\mathbf{x})}{\int d^D x \, \sum_{i=1}^{D} J_{ii}^{(k)}(\mathbf{x})} = \frac{\int_0^{\beta} d\xi \, \xi^{D+1} A_\phi(\xi^2, F, \tau)}{\int_0^{\infty} d\xi \, \xi^{D+1} A_\phi(\xi^2, F, \tau)} ,    (9)

where the index (k) was dropped because the tuning curves are assumed to have identical shapes. Equation (9) allows the definition of an effective receptive field, RF_{eff}^{(k)}, inside of which neuron k conveys a major fraction p_0 of its Fisher information: RF_{eff}^{(k)} := \{ \mathbf{x} \mid \xi^{(k)} \le \beta_0 \}, where \beta_0 is chosen such that p(\beta_0) = p_0. The Fisher information a neuron k carries is small unless \mathbf{x} \in RF_{eff}^{(k)}. This has the consequence that a fixed stimulus \mathbf{x} is actually encoded only by a subpopulation of neurons. The point \mathbf{x} in stimulus space is covered by

    N_{code} := \eta \, \frac{2 \pi^{D/2}}{D \, \Gamma(D/2)} \, \beta_0^D \prod_{l=1}^{D} \bar{\sigma}_l    (10)

receptive fields. With the help of (10), the average population Fisher information (8) can be re-written as

    \langle J_{ii} \rangle_\sigma = \frac{D^2 \, \Gamma(D/2) \, K_\phi(F, \tau, D)}{2 \pi^{D/2} \beta_0^D} \, \frac{N_{code}}{\bar{\sigma}_i^2} .    (11)

Equation (11) can be interpreted as follows: We assume that the population of neurons encodes stimulus dimension i accurately, while all other dimensions are of secondary importance. The average population Fisher information for dimension i, \langle J_{ii} \rangle_\sigma, is determined by the tuning width in dimension i, \bar{\sigma}_i, and by the size of the active subpopulation, N_{code}. There is a tradeoff between these quantities. On the one hand, the encoding error can be decreased by decreasing \bar{\sigma}_i, which enhances the Fisher information carried by each single neuron. Decreasing \bar{\sigma}_i, on the other hand, will also shrink the active subpopulation via (10). This impairs the encoding accuracy, because the stimulus position is evaluated from the activity of fewer neurons. 
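The gain available from the remaining dimensions can be checked numerically. The sketch below sums single-neuron contributions of the form (4) over a dense D = 2 lattice of identical tuning curves, again under an assumed Gaussian/Poisson model with illustrative parameters; according to (8), tripling the tuning width of the second dimension should triple the population Fisher information for the first dimension.

```python
import numpy as np

# Sketch (assumed Gaussian tuning + Poisson noise, illustrative parameters):
# population Fisher information J_11 at the origin for a D = 2 lattice of
# identical tuning curves with widths (s1, s2). A neuron centered at
# (c1, c2) contributes (tau*F/s1**2) * xi1**2 * exp(-(xi1**2 + xi2**2)/2),
# with xi_i = c_i / s_i, cf. Eq. (4).
F, tau, spacing = 100.0, 1.0, 0.05   # spacing << widths, so eta ~ const.

def population_J11(s1, s2):
    half = 6.0 * max(s1, s2)
    g = np.arange(-half, half + spacing, spacing)
    c1, c2 = np.meshgrid(g, g)
    xi1, xi2 = c1 / s1, c2 / s2
    return np.sum(tau * F / s1**2 * xi1**2 * np.exp(-(xi1**2 + xi2**2) / 2))

narrow = population_J11(0.5, 0.5)
broad = population_J11(0.5, 1.5)   # broaden only the second dimension
print(broad / narrow)              # close to 3, as Eq. (8) predicts for D = 2
```

Broadening \sigma_2 leaves each neuron's contribution to dimension 1 unchanged but triples the number of receptive fields covering the stimulus, which is the N_{code} mechanism behind (10) and (11).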
If (11) is valid due to a sufficient receptive field overlap, N_{code} can be increased by increasing the tuning widths, \bar{\sigma}_j, in all other dimensions j \ne i. This effect is illustrated in Fig. 2 for D = 2.

Figure 2: Encoding strategy for a stimulus characterized by parameters x_{1,s} and x_{2,s}. Feature x_1 is to be encoded accurately. Effective receptive field shapes are indicated for both populations. If neurons are narrowly tuned in x_2 (left), the active population (solid) is small (here: N_{code} = 3). Broadly tuned receptive fields for x_2 (right) yield a much larger population (here: N_{code} = 27), thus increasing the encoding accuracy.

It shall be noted that although a narrow tuning width \bar{\sigma}_i is advantageous, the limit \bar{\sigma}_i \to 0 yields a bad representation. For narrowly tuned cells, gaps appear between the receptive fields: the condition \eta(\mathbf{x}) \equiv const. breaks down, and (6) is no longer valid. A more detailed calculation shows that the encoding error diverges as \bar{\sigma}_i \to 0 [4]. The fact that the encoding error increases both for narrow tuning and for broad tuning - the latter due to (11) - proves the existence of an optimal tuning width. An example is given in Fig. 3a.

Figure 3: (a) Example for the encoding behavior with narrow tuning curves arranged on a regular lattice of dimension D = 1 (grid spacing \Delta). Tuning curves are Gaussian, and neural firing is modeled as a Poisson process. Dots indicate the minimal square encoding error averaged over a uniform distribution of stimuli, \langle \epsilon_{\min}^2 \rangle, as a function of \sigma. The minimum is clearly visible. 
The dotted line shows the corresponding approximation according to (8). The inset shows Gaussian tuning curves of optimal width, \sigma_{opt} \approx 0.4 \Delta. (b) g_D(\lambda) as a function of \lambda for different values of D.

Narrow distribution of tuning curves. In order to study the effects of encoding the stimulus with distributed tuning widths instead of identical tuning widths as in the previous cases, we now consider the distribution

    P_\sigma(\sigma_1, \ldots, \sigma_D) = \prod_{i=1}^{D} \frac{1}{b_i} \, \Theta\!\left[ \sigma_i - \left( \bar{\sigma}_i - \frac{b_i}{2} \right) \right] \Theta\!\left[ \left( \bar{\sigma}_i + \frac{b_i}{2} \right) - \sigma_i \right] ,    (12)

where \Theta denotes the Heaviside step function. Equation (12) describes a uniform distribution in a D-dimensional cuboid of size b_1, \ldots, b_D around (\bar{\sigma}_1, \ldots, \bar{\sigma}_D); cf. Fig. 1c. A straightforward calculation shows that in this case, the average population Fisher information (6) for i = j becomes

    \langle J_{ii} \rangle_\sigma = \eta D K_\phi(F, \tau, D) \, \frac{\prod_{l=1}^{D} \bar{\sigma}_l}{\bar{\sigma}_i^2} \left\{ 1 + \frac{1}{12} \left( \frac{b_i}{\bar{\sigma}_i} \right)^2 + O\!\left[ \left( \frac{b_i}{\bar{\sigma}_i} \right)^4 \right] \right\} .    (13)

A comparison with (8) yields the astonishing result that an increase in b_i results in an increase in the i-th diagonal element of the average population Fisher information matrix and thus in an improvement in the encoding of the i-th stimulus feature, while the encoding in dimensions j \ne i is not affected. Correspondingly, the total encoding error can be decreased by increasing an arbitrary number of edge lengths of the cuboid. The encoding by a population with a variability in the tuning curve geometries as described is more precise than that by a uniform population. This is true for arbitrary D. Zhang and Sejnowski [13] consider the more artificial situation of a correlated variability of the tuning widths: tuning curves are always assumed to be radially symmetric. This is indicated by the diagonal line in Fig. 1a. 
A distribution of tuning widths restricted to this subset yields an average population Fisher information \propto \langle \sigma^{D-2} \rangle and does not improve the encoding for D = 2 or D = 3.

Fragmentation into D subpopulations. Finally, we study a family of distributions of tuning widths which also yields a lower minimal encoding error than the uniform population. Let the density of tuning curves be given by

    P_\sigma(\sigma_1, \ldots, \sigma_D) = \frac{1}{D} \sum_{i=1}^{D} \delta(\sigma_i - \lambda \bar{\sigma}) \prod_{j \ne i} \delta(\sigma_j - \bar{\sigma}) ,    (14)

where \lambda > 0. For \lambda = 1, the population is uniform as in (7). For \lambda \ne 1, the population is split up into D subpopulations; in subpopulation i, \sigma_i is modified while \sigma_j \equiv \bar{\sigma} for j \ne i. See Fig. 1d for an example. The diagonal elements of the average population Fisher information are

    \langle J_{ii} \rangle_\sigma = \eta D K_\phi(F, \tau, D) \, \bar{\sigma}^{D-2} \, \frac{1 + (D-1) \lambda^2}{D \lambda} ,    (15)

where the last factor will be abbreviated as g_D(\lambda) := [1 + (D-1)\lambda^2] / (D\lambda). \langle J_{ii} \rangle_\sigma does not depend on i in this case because of the symmetry in the subpopulations. Equation (15) and the uniform case (7) differ by g_D(\lambda), which will now be discussed. Figure 3b shows g_D(\lambda) for different values of D. For \lambda = 1, g_D(\lambda) = 1 and (7) is recovered as expected. g_D(\lambda) = 1 also holds for \lambda = 1/(D-1) < 1: narrowing one tuning width in each subpopulation will at first decrease the resolution provided D \ge 3; this is due to the fact that N_{code} is decreased. For \lambda < 1/(D-1), however, g_D(\lambda) > 1, and the resolution exceeds \langle J_{ii} \rangle_\sigma in (7) because each neuron in the i-th subpopulation carries a high Fisher information in the i-th dimension. D = 2 is a special case where no impairment of encoding occurs because the effect of a decrease of N_{code} is less pronounced. Interestingly, an increase in \lambda also yields an improvement in the encoding accuracy. 
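These properties of g_D(\lambda) follow from elementary arithmetic on (15) and can be verified directly; the sketch below involves no modeling assumptions beyond the formula itself.

```python
import numpy as np

# g_D(lambda) = (1 + (D - 1) * lambda**2) / (D * lambda): the factor by
# which Eq. (15) differs from the uniform case (7).
def g(D, lam):
    return (1.0 + (D - 1) * lam**2) / (D * lam)

for D in (3, 4, 5):
    assert abs(g(D, 1.0) - 1.0) < 1e-12             # uniform population, Eq. (7)
    assert abs(g(D, 1.0 / (D - 1)) - 1.0) < 1e-12   # second crossing of 1
    assert g(D, 1.0 / np.sqrt(D - 1)) < 1.0         # impairment in between
    assert g(D, 0.05) > 1.0 and g(D, 4.0) > 1.0     # gains at both extremes
# D = 2: the minimum of g_2 sits at lambda = 1, so no impairment occurs.
assert min(g(2, lam) for lam in np.linspace(0.05, 4.0, 1000)) >= 1.0
```

In particular, g_D(\lambda) exceeds 1 for all \lambda > 1 as well, in line with the improvement just noted.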
This is a combined effect resulting from an increase in N_{code} on the one hand and the existence of D subpopulations, D - 1 of which maintain their tuning widths in each dimension, on the other hand. The discussion of g_D(\lambda) leads to the following encoding strategy. For small \lambda, \langle J_{ii} \rangle_\sigma increases rapidly, which suggests a fragmentation of the population into D subpopulations each of which encodes one feature with high accuracy, i.e., one tuning width in each subpopulation is small whereas the remaining tuning widths are broad. As in the case discussed above, the theoretical limit of this method is a breakdown of the approximation \eta \equiv const. and of the validity of (6) due to insufficient receptive field overlap.

4 Discussion and Outlook

We have discussed the effects of a variation of the tuning widths on the encoding accuracy obtained by a population of stochastically spiking neurons. The question of an optimal tuning strategy has turned out to be more complicated than previously assumed. More specifically, the case which has attracted most attention in the literature - radially symmetric receptive fields [5, 1, 9, 3, 13] - yields a worse encoding accuracy than most other cases we have studied: uniform populations with tuning curves which are not radially symmetric; distributions of tuning curves around some symmetric or non-symmetric tuning curve; and the fragmentation of the population into D subpopulations each of which is specialized in one stimulus feature.

In a next step, the theoretical results will be compared to empirical data on encoding properties of neural populations. One aspect is the existence of sensory maps which consist of neural subpopulations with characteristic tuning properties for the features which are represented. For example, receptive fields of auditory neurons in the midbrain of the barn owl have elongated shapes [6]. 
A second aspect concerns the short-term dynamics of receptive fields. Using single-unit recordings in anaesthetized cats, Wörgötter et al. [12] observed changes in receptive field size taking place within 50-100 ms. Our findings suggest that these dynamics alter the resolution obtained for the corresponding stimulus features. The observed effect may therefore realize a mechanism of adaptable selective signal processing.

References

[1] Baldi, P. & Heiligenberg, W. (1988) Biol. Cybern. 59:313-318.
[2] Deco, G. & Obradovic, D. (1997) An Information-Theoretic Approach to Neural Computing. New York: Springer.
[3] Eurich, C. W. & Schwegler, H. (1997) Biol. Cybern. 76:357-363.
[4] Eurich, C. W. & Wilke, S. D. (2000) Neural Comp. (in press).
[5] Hinton, G. E., McClelland, J. L. & Rumelhart, D. E. (1986) In Rumelhart, D. E. & McClelland, J. L. (eds.), Parallel Distributed Processing, Vol. 1, pp. 77-109. Cambridge, MA: MIT Press.
[6] Knudsen, E. I. & Konishi, M. (1978) Science 200:795-797.
[7] Kuffler, S. W. (1953) J. Neurophysiol. 16:37-68.
[8] Lettvin, J. Y., Maturana, H. R., McCulloch, W. S. & Pitts, W. H. (1959) Proc. Inst. Radio Eng. NY 47:1940-1951.
[9] Snippe, H. P. & Koenderink, J. J. (1992) Biol. Cybern. 66:543-551.
[10] Wiggers, W., Roth, G., Eurich, C. W. & Straub, A. (1995) J. Comp. Physiol. A 176:365-377.
[11] Wilke, S. D. & Eurich, C. W. (1999) In Verleysen, M. (ed.), ESANN 99, European Symposium on Artificial Neural Networks, pp. 435-440. Brussels: D-Facto.
[12] Wörgötter, F., Suder, K., Zhao, Y., Kerscher, N., Eysel, U. T. & Funke, K. (1998) Nature 396:165-168.
[13] Zhang, K. & Sejnowski, T. J. (1999) Neural Comp. 11:75-84.