{"title": "Optimal Sampling of Natural Images: A Design Principle for the Visual System", "book": "Advances in Neural Information Processing Systems", "page_first": 363, "page_last": 369, "abstract": null, "full_text": "Optimal Sampling of Natural Images: A Design \n\nPrinciple for the Visual System? \n\nWilliam Bialek, a,b Daniel L. Ruderman, a and A. Zee C \n\na Department of Physics, and \n\nDepartment of Molecular and Cell Biology \n\nUniversity of California at Berkeley \n\nBerkeley, California 94720 \n\nbNEC Research Institute \n\n4 Independence Way \n\nPrinceton, New Jersey 08540 \n\nCInstitute for Theoretical Physics \n\nUniversity of California at Santa Barbara \n\nSanta Barbara, California 93106 \n\nAbstract \n\nWe formulate the problem of optimizing the sampling of natural images \nusing an array of linear filters. Optimization of information capacity is \nconstrained by the noise levels of the individual channels and by a penalty \nfor the construction of long-range interconnections in the array. At low \nsignal-to-noise ratios the optimal filter characteristics correspond to bound \nstates of a Schrodinger equation in which the signal spectrum plays the \nrole of the potential. The resulting optimal filters are remarkably similar \nto those observed in the mammalian visual cortex and the retinal ganglion \ncells of lower vertebrates. The observed scale invariance of natural images \nplays an essential role in this construction. \n\n363 \n\n\f364 \n\nBialek, Ruderman, and Zee \n\n1 \n\nIntroduction \n\nUnder certain conditions the visual system is capable of performing extremely effi(cid:173)\ncient signal processing [I]. One ofthe major theoretical issues in neural computation \nis to understand how this efficiency is reached given the constraints imposed by the \nbiological hardware. 
Part of the problem [2] is simply to give an informative representation of the visual world using a limited number of neurons, each of which has a limited information capacity. The information capacity of the visual system is determined in part by the spatial transfer characteristics, or \"receptive fields,\" of the individual cells. From a theoretical point of view we can ask if there exists an optimal choice for these receptive fields, a choice which maximizes the information transfer through the system given the hardware constraints. We show that this optimization problem has a simple formulation which allows us to use the intuition developed through the variational approach to quantum mechanics. \n\nIn general our approach leads to receptive fields which are quite unlike those observed for cells in the visual cortex. In particular, orientation selectivity is not a generic prediction. The optimal filters, however, depend on the statistical properties of the images we are trying to sample. Natural images have a symmetry, scale invariance [4], which saves the theory: the optimal receptive fields for sampling of natural images are indeed orientation selective and bear a striking resemblance to observed receptive field characteristics in the mammalian visual cortex as well as the retinal ganglion cells of lower vertebrates. \n\n2 General Theoretical Formulation \n\nWe assume that images are defined by a scalar field φ(x) on a two-dimensional surface with coordinates x. This image is sampled by an array of cells whose outputs y_n are given by \n\ny_n = ∫ d²x F(x − x_n) φ(x) + η_n,   (1) \n\nwhere the cell is located at site x_n, its spatial transfer function or receptive field is defined by F, and η_n is an independent noise source at each sampling point. We will assume for simplicity that the noise source is Gaussian, with ⟨η²⟩ = σ². 
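The sampling model of Eq. (1) is straightforward to simulate. The sketch below (Python; the 1/|k|² image statistics anticipate Sec. 3, while the Gaussian receptive field, grid size, sampling spacing, and noise level are arbitrary illustrative choices, not the paper's optimal filters) draws a scale-invariant random image, applies an L²-normalized filter at an array of sites, and adds independent Gaussian noise of variance σ²:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative "natural image" phi(x) with power spectrum ~ 1/|k|^2 on a 64x64 grid.
n = 64
kx = np.fft.fftfreq(n)[:, None]
ky = np.fft.fftfreq(n)[None, :]
k2 = kx**2 + ky**2
k2[0, 0] = np.inf                       # suppress the divergent DC mode
amp = 1.0 / np.sqrt(k2)                 # amplitude ~ 1/|k|  =>  power ~ 1/|k|^2
phi = np.real(np.fft.ifft2(amp * np.fft.fft2(rng.standard_normal((n, n)))))

# Receptive field F: an isotropic Gaussian (an illustrative stand-in, not the
# optimal filter derived in the text), applied at every 8th grid point.
x = np.arange(n) - n // 2
F = np.exp(-(x[:, None]**2 + x[None, :]**2) / (2 * 3.0**2))
F /= np.sqrt(np.sum(F**2))              # L2 normalization, as in Eq. (5)

sigma = 0.1                             # channel noise level
blurred = np.real(np.fft.ifft2(np.fft.fft2(phi) * np.fft.fft2(np.fft.ifftshift(F))))
centers = np.arange(0, n, 8)
y = blurred[np.ix_(centers, centers)] + sigma * rng.standard_normal((8, 8))
print(y.shape)                          # an 8x8 array of noisy outputs y_n
```

Each entry of `y` is one cell's output: the image filtered through F at that cell's site plus its own Gaussian noise, exactly the structure of Eq. (1) on a discrete grid.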
Our task is to find the receptive field F which maximizes the information provided about φ by the set of outputs {y_n}. \n\nIf the field φ is itself chosen from a stationary Gaussian distribution then the information carried by the {y_n} is given by [3] \n\nI = (1/2 ln 2) Tr ln [ δ_nm + (1/σ²) ∫ d²k/(2π)² e^{ik·(x_n − x_m)} |F̃(k)|² S(k) ],   (2) \n\nwhere S(k) is the power spectrum of the signals, \n\nS(k) = ∫ d²y e^{−ik·y} ⟨φ(x + y) φ(x)⟩,   (3) \n\nand F̃(k) = ∫ d²x e^{−ik·x} F(x) is the receptive field in momentum (Fourier) space. \n\nAt low signal-to-noise ratios (large σ²) we have \n\nI ≈ (N/2 ln 2 σ²) ∫ d²k/(2π)² |F̃(k)|² S(k),   (4) \n\nwhere N is the total number of cells. \n\nTo make our definition of the noise level σ meaningful we must constrain the total \"gain\" of the filters F. One simple approach is to normalize the functions F in the usual L² sense, \n\n∫ d²x F²(x) = ∫ d²k/(2π)² |F̃(k)|² = 1.   (5) \n\nIf we imagine driving the system with spectrally white images, this condition fixes the total signal power passing through the filter. \n\nEven with normalization, optimization of information capacity is still not well-posed. To avoid pathologies we must constrain the scale of variations in k-space. This makes sense biologically since we know that sharp features in k-space can be achieved only by introducing long-range interactions in real space, and cells in the visual system typically have rather local interconnections. 
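At low SNR the capacity of Eq. (4) is just a spectrally weighted power gain, so among filters obeying the normalization of Eq. (5) the information is largest for filters that concentrate their gain where S(k) is large. A minimal numerical sketch (Python; the Gaussian stand-in spectrum, grid, and noise variance are illustrative assumptions, not values from the paper, and a Gaussian is used in place of 1/|k|² so the integral is finite at k = 0):

```python
import numpy as np

def info_low_snr(F_hat, S, sigma2, N=1, dk=1.0):
    """Low-SNR information of Eq. (4) on a discrete k-grid.

    F_hat : |F~(k)| on the grid (assumed L2-normalized per Eq. (5))
    S     : signal power spectrum on the same grid
    """
    integral = np.sum(np.abs(F_hat)**2 * S) * dk**2 / (2 * np.pi)**2
    return N / (2 * np.log(2) * sigma2) * integral

# Illustrative grid and isotropic Gaussian spectrum.
n, dk = 128, 0.1
k = dk * (np.arange(n) - n // 2)
KX, KY = np.meshgrid(k, k)
K2 = KX**2 + KY**2
S = np.exp(-K2)

def normalize(F_hat):
    # Enforce the Eq. (5) constraint:  sum |F~|^2 dk^2/(2 pi)^2 = 1
    norm = np.sqrt(np.sum(np.abs(F_hat)**2) * dk**2 / (2 * np.pi)**2)
    return F_hat / norm

narrow = normalize(np.exp(-K2 / 0.1))   # gain concentrated where S is large
broad  = normalize(np.exp(-K2 / 50.0))  # gain spread into low-S regions
print(info_low_snr(narrow, S, sigma2=100.0, dk=dk)
      > info_low_snr(broad, S, sigma2=100.0, dk=dk))
```

With the same total gain, the filter matched to the high-power region of the spectrum carries more information, which is why the locality penalty introduced next is needed to keep the problem well-posed.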
We implement this constraint by introducing a penalty proportional to the mean square spatial extent of the receptive field, \n\n∫ d²x |x|² F²(x) = ∫ d²k/(2π)² |∇_k F̃(k)|².   (6) \n\nWith all the constraints we find that, at low signal-to-noise ratio, our optimization problem becomes that of minimizing the functional \n\nC[F̃] = ∫ d²k/(2π)² [ (α/2) |∇_k F̃(k)|² − (1/2 ln 2 σ²) S(k) |F̃(k)|² − λ |F̃(k)|² ],   (7) \n\nwhere λ is a Lagrange multiplier and α measures the strength of the locality constraint. The optimal filters are then solutions of the variational equation, \n\n−(α/2) ∇²_k F̃(k) − (1/2 ln 2 σ²) S(k) F̃(k) = λ F̃(k).   (8) \n\nWe recognize this as the Schrödinger equation for a particle moving in k-space, in which the mass M = ℏ²/α, the potential V(k) = −S(k)/2 ln 2 σ², and λ is the energy eigenvalue. Since we are interested in normalizable F̃, we are restricted to bound states, and the optimal filter is just the bound state wave function. \n\nThere are in general several optimal filters, corresponding to the different bound states. Each of these filters gives the same value for the total cost function C[F̃] and hence is equally \"good\" in this context. Thus each sampling point should be served by a set of filters rather than just one. Indeed, in the visual cortex one finds a given region of the visual field being sampled by many cells with different spatial frequency and orientation selectivities. \n\n3 A Near-Fatal Flaw and its Resolution \n\nIf the signal spectra S(k) are isotropic, so that features appear at all orientations across the visual field, all of the bound states of the corresponding Schrödinger equation are eigenstates of angular momentum. But real visual neurons have receptive fields with a single optimal orientation, not the multiple optima expected if the filters F̃ correspond to angular momentum eigenstates. One would like to combine different angular momentum eigenfunctions to generate filters which respond to localized regions of orientation. 
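The bound states invoked in this argument can be found numerically. A minimal sketch (Python; a one-dimensional cut with an illustrative Gaussian spectrum S(k), and arbitrary values of α, σ², and the grid, none taken from the paper) diagonalizes a finite-difference version of Eq. (8):

```python
import numpy as np

# 1-D sketch of the variational equation (8):
#   -(alpha/2) F''(k) - S(k)/(2 ln2 sigma^2) F(k) = lambda F(k)
# with an illustrative Gaussian spectrum S(k), not the 1/|k|^2 natural-image
# spectrum (which requires the scaling treatment of Sec. 3).
n, dk = 400, 0.05
k = dk * (np.arange(n) - n // 2)
alpha, sigma2 = 1.0, 1.0
S = 5.0 * np.exp(-k**2)

# Finite-difference Laplacian with zero boundary conditions (normalizable states).
lap = (np.diag(np.full(n - 1, 1.0), -1) - 2 * np.eye(n)
       + np.diag(np.full(n - 1, 1.0), 1)) / dk**2
H = -(alpha / 2) * lap - np.diag(S / (2 * np.log(2) * sigma2))

evals, evecs = np.linalg.eigh(H)        # eigenvalues in ascending order
F0 = evecs[:, 0]
print(evals[0] < 0)                     # the lowest state is bound (lambda < 0)
print(abs(F0[0]) < 1e-3)                # its filter decays at the grid edge
```

For a smooth, bounded spectrum the bound states are discrete and non-degenerate, which is exactly the obstruction described next: one cannot superpose them without leaving the solution set.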
In general, however, the different angular momenta are associated with different energy eigenvalues, and hence it is impossible to form linear combinations which are still solutions of the variational problem. \n\nWe can construct receptive fields which are localized in orientation if there is some extra symmetry or accidental degeneracy which allows the existence of equal-energy states with different angular momenta. If we believe that real receptive fields are the solutions of our variational problem, it must be the case that the signal spectrum S(k) for natural images possesses such a symmetry. \n\nRecently Field [4] has measured the power spectra of several natural scenes. As one might expect from discussions of \"fractal\" landscapes, these spectra are scale invariant, with S(k) = A/|k|². It is easy to see that the corresponding quantum mechanics problem is a bit sick: the energy is not bounded from below. In the present context, however, this sickness is a saving grace. The equivalent Schrödinger equation is \n\n−(α/2) ∇²_k F̃(k) − (A/2 ln 2 σ²) |k|⁻² F̃(k) = λ F̃(k).   (9) \n\nIf we take q = √(2|λ|/α) k, then for bound states (λ < 0) we find \n\n∇²_q F̃(q) + (B/|q|²) F̃(q) = F̃(q),   (10) \n\nwith B = A/(α ln 2 σ²). Thus we see that the energy λ can be scaled away; there is no quantization condition. We are free to choose any value of λ, but for each such value there are several angular momentum states. Since they correspond to the same energy, superpositions of these states are also solutions of the original variational problem. The scale invariance of natural images is the symmetry we need in order to form localized receptive fields. \n\n4 Predicting Receptive Fields \n\nTo solve Eq. (9) we find it easier to transform back to real space. 
The result is \n\nr²(1 + r²) ∂²F/∂r² + r(1 + 5r²) ∂F/∂r + [r²(4 + B + ∂²/∂φ²) + ∂²/∂φ²] F = 0,   (11) \n\nwhere φ is the angular variable and r = √(α/2|λ|) |x|. Angular momentum states F_m ∼ e^{imφ} have the asymptotics F_m(r ≪ 1) ∼ r^{±m} and F_m(r ≫ 1) ∼ r^{λ±(m)}, with λ±(m) = −2 ± √(m² − B). We see that for m² < B the solutions are oscillatory functions of r, since λ±(m) has an imaginary part. For m² > B + 4 the solution can diverge as r becomes large, and in this case we must be careful to choose solutions which are regular both at the origin and at infinity if we are to maintain the constraint in Eq. (5). Numerically we find that there are no such solutions; the functions which behave as r^{+|m|} near the origin diverge at large r if m² > B + 4. We conclude that for a given value of B, which measures the signal-to-noise ratio, there exists a finite set of angular momentum states; these states can then be superposed to give receptive fields with localized angular sensitivity. \n\nIn fact all linear combinations of m-states are solutions to the variational problem at low signal-to-noise ratio, so the precise form of orientation tuning is not determined. If we continue our expansion of the information capacity in powers of the signal-to-noise ratio, we find terms which will select different linear combinations of the m-states and hence determine the precise orientation selectivity. These higher-order terms, however, involve multi-point correlation functions of the image. At the lowest SNR, corresponding to the first term in our expansion, we are sensitive only to the two-point function (power spectrum) of the signal ensemble, which carries no information about angular correlations. 
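The superposition of m-states described above is easy to visualize. In the sketch below (Python) a single Gaussian-like radial envelope stands in for the true radial solutions of Eq. (11), an illustrative simplification; the point is only that adding the angular harmonics e^{imφ} in phase yields a receptive field localized in orientation:

```python
import numpy as np

# In-phase superposition of angular harmonics cos(m*phi), m = 0..3, with a
# common radial envelope (an illustrative stand-in for the radial solutions
# of Eq. (11); it vanishes at the origin and decays at large r).
n = 129
x = np.linspace(-4, 4, n)
X, Y = np.meshgrid(x, x)
r = np.hypot(X, Y)
phi = np.arctan2(Y, X)

envelope = r * np.exp(-r**2)
F = envelope * sum(np.cos(m * phi) for m in range(4))

# Sensitivity is largest along phi = 0 (all harmonics add) and suppressed at
# phi = pi/2 (the m = 0..3 cosines nearly cancel: 1 + 0 - 1 + 0).
c = n // 2
print(F[c, c + 10] > abs(F[c + 10, c]))
```

The more m-states are available (larger B, i.e. higher SNR), the more sharply the angular sensitivity can be localized, which is the sense in which orientation tuning improves with SNR in the text.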
A truly predictive theory of orientation tuning must thus rest on measurements of angular correlations in natural images; as far as we know such measurements have not been reported. \n\nEven without knowing the details of the higher-order correlation functions we can make some progress. To begin, it is clear that at very small B orientation selectivity is impossible since there are only m = 0 solutions. This is the limit of very low SNR, or equivalently very strong constraints on the locality of the receptive field (large α above). The circularly symmetric receptive fields that one finds in this limit are center-surround in structure, with the surround becoming more prominent as the signal-to-noise ratio is increased. These predictions are in qualitative accord with what one sees in the mammalian retina, which is indeed extremely local: receptive field centers for foveal ganglion cells may consist of just a single cone photoreceptor. As one proceeds to the cortex the constraints of locality are weaker and orientation selectivity becomes possible. Similarly, in lower vertebrates there is a greater range of lateral connectivity in the retina itself, and hence orientation selectivity is possible at the level of the ganglion cell. \n\nTo proceed further we have explored the types of receptive fields which can be produced by superposing m-states at a given value of B. We consider for the moment only even-symmetric receptive fields, so we add all terms in phase. One such receptive field is shown in Fig. 1, together with experimental results for a simple cell in the primary visual cortex of monkeys [5]. It is clear that we can obtain reasonable correspondence between theory and experiment. Obviously we have made no detailed \"fit\" to the data, and indeed we are just beginning a quantitative comparison of theory with experiment. Much of the arbitrariness in the construction of Fig. 
1 will be removed once we have control over the higher terms in the SNR expansion, as described above. \n\nIt is interesting that, at low SNR, there is no preferred value for the length scale. Thus the optimal system may choose to sample images at many different scales, and at different scales in different regions of the image. The experimental variability in spatial frequency tuning from cell to cell may thus not represent biological sloppiness but rather the fact that any peak spatial frequency constitutes an optimal filter in the sense defined here. \n\nFigure 1: Model (left) and monkey (right) receptive fields. Monkey RF is from reference [5]. \n\n5 Discussion \n\nThe selectivity of cortical neurons for orientation and spatial frequency is among the best known facts about the visual system. Not surprisingly, there have been many attempts to derive these features from some theoretical perspective. One approach is to argue that such selectivity provides a natural preprocessing stage for more complex computations. A very different view is that the observed organization of the cortex is a consequence of developmental rules, but this approach does not address the computational function which may be expressed by cortical organization. Finally, several authors have considered the possibility that cortical receptive fields are in some sense optimal, so that they can be predicted from a variational principle [6, 7, 8]. Clearly we have adopted this last hypothesis; the issue is whether one can make a compelling argument for any particular variational principle. \n\nOptimization of information capacity seems like a very natural principle to apply in the early stages of visual processing. As we have emphasized, this principle must be supplemented by a knowledge of hardware constraints and of image statistics. Different authors have made different choices, 
especially for the constraints. Different formulations, however, may be related: optimization of information transfer at some fixed \"gain\" of the receptive fields is equivalent, through a Legendre transformation, to minimization of the redundancy at fixed information transfer, a problem discussed by Atick and Redlich [8]. This latter approach has given very successful predictions for the structure of ganglion cell receptive fields in cat and monkey, although there are still some arbitrary parameters to be determined. It is our hope that these ideas of receptive fields as solutions to variational problems can be given more detailed tests in the lower vertebrate retinas, where it is possible to characterize signals and noise at each of three layers of processing circuitry. \n\nAs far as we know our work is unique in that the statistics of natural images is an essential component of the theory. Indeed, the scale invariance of natural images plays a decisive role in our prediction of orientation selectivity; other classes of signals would result in qualitatively different receptive fields. We find this direct linkage between the properties of natural images and the architecture of natural computing systems to be extremely attractive. The semi-quantitative correspondence between predicted and observed receptive fields (Fig. 1) suggests that we have the kernel of a truly predictive theory for visual processing. \n\nAcknowledgements \n\nWe thank K. DeValois, R. DeValois, J. D. Jackson, and N. Socci for helpful discussions. Work at Berkeley was supported in part by the National Science Foundation through a Presidential Young Investigator Award (to WB), supplemented by funds from Sun Microsystems and Cray Research, and by the Fannie and John Hertz Foundation through a graduate fellowship (to DLR). 
Work in Santa Barbara was supported in part by the NSF through Grant No. PHY82-17853, supplemented by funds from NASA. \n\nReferences \n\n[1] W. Bialek. In E. Jen, editor, 1989 Lectures in Complex Systems, SFI Studies in the Sciences of Complexity, Lect. Vol. II, pages 513-595. Addison-Wesley, Menlo Park, CA, 1990. \n\n[2] H. B. Barlow. In W. A. Rosenblith, editor, Sensory Communication, page 217. MIT Press, Cambridge, MA, 1961. \n\n[3] C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL, 1949. \n\n[4] D. Field. J. Opt. Soc. Am., 4:2379, 1987. \n\n[5] M. A. Webster and R. L. DeValois. J. Opt. Soc. Am., 2:1124-1132, 1985. \n\n[6] B. Sakitt and H. B. Barlow. Biol. Cybern., 43:97-108, 1982. \n\n[7] R. Linsker. In D. Touretzky, editor, Advances in Neural Information Processing 1, page 186. Morgan Kaufmann, San Mateo, CA, 1989. \n\n[8] J. J. Atick and A. N. Redlich. Neural Computation, 2:308, 1990.", "award": [], "sourceid": 375, "authors": [{"given_name": "William", "family_name": "Bialek", "institution": null}, {"given_name": "Daniel", "family_name": "Ruderman", "institution": null}, {"given_name": "A.", "family_name": "Zee", "institution": null}]}