Odelia Schwartz, Terrence J. Sejnowski, Peter Dayan
In the analysis of natural images, Gaussian scale mixtures (GSM) have been used to account for the statistics of filter responses, and to inspire hierarchical cortical representational learning schemes. GSMs pose a critical assignment problem, working out which filter responses were generated by a common multiplicative factor. We present a new approach to solving this assignment problem through a probabilistic extension to the basic GSM, and show how to perform inference in the model using Gibbs sampling. We demonstrate the efficacy of the approach on both synthetic and image data.
Understanding the statistical structure of natural images is an important goal for visual neuroscience. Neural representations in early cortical areas decompose images (and likely other sensory inputs) in a way that is sensitive to sophisticated aspects of their probabilistic structure. This structure also plays a key role in methods for image processing and coding. A striking aspect of natural images that has reflections in both top-down and bottom-up modeling is coordination across nearby locations, scales, and orientations. From a top-down perspective, this structure has been modeled using what is known as a Gaussian Scale Mixture model (GSM).1–3 GSMs involve a multi-dimensional Gaussian (each dimension of which captures local structure as in a linear filter), multiplied by a spatialized collection of common hidden scale variables or mixer variables* (which capture the coordination). GSMs have wide implications in theories of cortical receptive field development, eg the comprehensive bubbles framework of Hyvärinen.4 The mixer variables provide the top-down account of two bottom-up characteristics of natural image statistics, namely the 'bowtie' statistical dependency,5,6 and the fact that the marginal distributions of receptive field-like filters have high kurtosis.7,8 In hindsight, these ideas also bear a close relationship with Ruderman and Bialek's multiplicative bottom-up image analysis framework9 and statistical models for divisive gain control.6 Coordinated structure has also been addressed in other image work,10–14 and in other domains such as speech15 and finance.16

Many approaches to the unsupervised specification of representations in early cortical areas rely on the coordinated structure.17–21 The idea is to learn linear filters (eg modeling simple cells as in22,23), and then, based on the coordination, to find combinations of these (perhaps non-linearly transformed) as a way of finding higher order filters (eg complex cells). One critical facet whose specification from data is not obvious is the neighborhood arrangement, ie which linear filters share which mixer variables.
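The high-kurtosis marginal that a shared mixer induces can be verified numerically. The following is a minimal sketch (not from the paper; it assumes a unit-scale Rayleigh mixer) showing that multiplying a Gaussian by a Rayleigh mixer variable produces a heavy-tailed, high-kurtosis marginal, while the Gaussian alone has excess kurtosis near zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200_000

# Gaussian component g and a Rayleigh-distributed mixer v. In the full GSM
# one v is shared across several filter dimensions; one dimension suffices
# to exhibit the heavy-tailed marginal.
g = rng.standard_normal(n_samples)
v = rng.rayleigh(scale=1.0, size=n_samples)
l = v * g  # GSM "filter response": Gaussian times mixer

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

k_gauss = excess_kurtosis(g)  # close to 0
k_gsm = excess_kurtosis(l)    # substantially positive (analytically 3 here)
```

For this mixer the excess kurtosis of l is analytically 3, matching the qualitative claim that GSM marginals are far more kurtotic than a Gaussian.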
*Mixer variables are also called multipliers, but are unrelated to the scales of a wavelet.
Here, we suggest a method for finding the neighborhood based on Bayesian inference of the GSM random variables. In section 1, we consider estimating these components based on information from different-sized neighborhoods and show the modes of failure when inference is too local or too global. Based on these observations, in section 2 we propose an extension to the GSM generative model, in which the mixer variables can overlap probabilistically. We solve the neighborhood assignment problem using Gibbs sampling, and demonstrate the technique on synthetic data. In section 3, we apply the technique to image data.
1 GSM inference of Gaussian and mixer variables
In a simple, n-dimensional version of a GSM, filter responses l are synthesized by multiplying an n-dimensional Gaussian with values g = {g_1 ... g_n} by a common mixer variable v:

l = v g    (1)

We assume the g are uncorrelated (σ² along the diagonal of the covariance matrix). For the analytical calculations, we assume that v has a Rayleigh distribution:
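The generative model of equation (1) can be sketched numerically. The snippet below (an illustration, not the authors' code; it assumes n = 2, σ = 1, and a unit-scale Rayleigh mixer) samples from the GSM and checks the coordination the mixer induces: the filter responses remain linearly uncorrelated, yet their magnitudes are positively correlated through the shared v, which is the 'bowtie' dependency noted above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_samples = 2, 100_000
sigma = 1.0

# g: n-dimensional Gaussian with diagonal covariance sigma^2 I (uncorrelated)
g = sigma * rng.standard_normal((n_samples, n))
# v: one Rayleigh mixer per sample, shared by all n dimensions
v = rng.rayleigh(scale=1.0, size=(n_samples, 1))
l = v * g  # eq. (1): l = v g

# Linear correlation between l_1 and l_2 stays near zero...
r_lin = np.corrcoef(l[:, 0], l[:, 1])[0, 1]
# ...but the magnitudes are positively correlated via the shared mixer
r_mag = np.corrcoef(np.abs(l[:, 0]), np.abs(l[:, 1]))[0, 1]
```

Here r_lin is near 0 while r_mag is clearly positive (analytically about 0.27 for this mixer), so the coordination lives entirely in the scale variable rather than in the Gaussian component.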