Part of Advances in Neural Information Processing Systems 19 (NIPS 2006)
Siwei Lyu, Eero Simoncelli
The local statistical properties of photographic images, when represented in a multi-scale basis, have been described using Gaussian scale mixtures (GSMs). Here, we use this local description to construct a global field of Gaussian scale mixtures (FoGSM). Specifically, we model subbands of wavelet coefficients as a product of an exponentiated homogeneous Gaussian Markov random field (hGMRF) and a second independent hGMRF. We show that parameter estimation for FoGSM is feasible, and that samples drawn from an estimated FoGSM model have marginal and joint statistics similar to wavelet coefficients of photographic images. We develop an algorithm for image denoising based on the FoGSM model, and demonstrate substantial improvements over current state-ofthe-art denoising method based on the local GSM model. Many successful methods in image processing and computer vision rely on statistical models for images, and it is thus of continuing interest to develop improved models, both in terms of their ability to precisely capture image structures, and in terms of their tractability when used in applications. Constructing such a model is difficult, primarily because of the intrinsic high dimensionality of the space of images. Two simplifying assumptions are usually made to reduce model complexity. The first is Markovianity: the density of a pixel conditioned on a small neighborhood, is assumed to be independent from the rest of the image. The second assumption is homogeneity: the local density is assumed to be independent of its absolute position within the image. The set of models satisfying both of these assumptions constitute the class of homogeneous Markov random fields (hMRFs). Over the past two decades, studies of photographic images represented with multi-scale multiorientation image decompositions (loosely referred to as "wavelets") have revealed striking nonGaussian regularities and inter and intra-subband dependencies. For instance, wavelet coefficients generally have highly kurtotic marginal distributions [1, 2], and their amplitudes exhibit strong correlations with the amplitudes of nearby coefficients [3, 4]. One model that can capture the nonGaussian marginal behaviors is a product of non-Gaussian scalar variables [5]. A number of authors have developed non-Gaussian MRF models based on this sort of local description [6, 7, 8], among which the recently developed fields of experts model [7] has demonstrated impressive performance in denoising (albeit at an extremely high computational cost in learning model parameters). An alternative model that can capture non-Gaussian local structure is a scale mixture model [9, 10, 11]. An important special case is Gaussian scale mixtures (GSM), which consists of a Gaussian random vector whose amplitude is modulated by a hidden scaling variable. The GSM model provides a particularly good description of local image statistics, and the Gaussian substructure of the model leads to efficient algorithms for parameter estimation and inference. Local GSM-based methods represent the current state-of-the-art in image denoising [12]. The power of GSM models should be substantially improved when extended to describe more than a small neighborhood of wavelet coefficients. To this end, several authors have embedded local Gaussian mixtures into tree-structured
MRF models [e.g., 13, 14]. In order to maintain tractability, these models are arranged such that coefficients are grouped in non-overlapping clusters, allowing a graphical probability model with no loops. Despite their global consistency, the artificially imposed cluster boundaries lead to substantial artifacts in applications such as denoising. In this paper, we use a local GSM as a basis for a globally consistent and spatially homogeneous field of Gaussian scale mixtures (FoGSM). Specifically, the FoGSM is formulated as the product of two mutually independent MRFs: a positive multiplier field obtained by exponentiating a homogeneous Gaussian MRF (hGMRF), and a second hGMRF. We develop a parameter estimation procedure, and show that the model is able to capture important statistical regularities in the marginal and joint wavelet statistics of a photographic image. We apply the FoGSM to image denoising, demonstrating substantial improvement over the previous state-of-the-art results obtained with a local GSM model.
1 Gaussian scale mixtures A GSM random vector x is formed as the product of a zero-mean Gaussian random vector u and an d d independent random variable z, as x = zu, where = denotes equality in distribution. The density of x is determined by the covariance of the Gaussian vector, , and the density of the multiplier, p z (z), through the integral - T -1 p z z xx 1 exp (1) p(x) = Nx (0, z) pz (z)dz z (z)d z. 2z z|| A key property of GSMs is that when z determines the scale of the conditional variance of x given z, wich is a Gaussian variable with zero mean and covariance z. In addition, the normalized variable h x z is a zero mean Gaussian with covariance matrix . The GSM model has been used to describe the marginal and joint densities of local clusters of wavelet coefficients, both within and across subbands [9], where the embedded Gaussian structure affords simple and efficient computation. This local GSM model has been be used for denoising, by independently estimating each coefficient conditioned on its surrounding cluster [12]. This method achieves state-of-the-art performances, despite the fact that treating overlapping clusters as independent does not give rise to a globally consistent statistical model that satisfies all the local constraints.
2 Fields of Gaussian scale mixtures In this section, we develop fields of Gaussian scale mixtures (FoGSM) as a framework for modeling wavelet coefficients of photographic images. Analogous to the local GSM model, we use a latent multiplier field to modulate a homogeneous Gaussian MRF (hGMRF). Formally, we define a FoGSM x as the product of two mutually independent MRFs, d x = u z, (2) where u is a zero-mean hGMRF, and z is a field of positive multipliers that control the local coefficient variances. The operator denotes element-wise multiplication, and the square root operation is applied to each component. Note that x has a one-dimensional GSM marginal distributions, while its components have dependencies captured by the MRF structures of u and z. Analogous to the local GSM, when conditioned on z, x is an inhomogeneous GMRF | | - - -1 x = 1 T D -1 1 Qu | Qu | p(x|z) x (x z)T Qu (x i exp z Qu D z i exp zi 2 zi 2 , z)
(3) where Qu is the inverse covariance matrix of u (also known as the precision matrix), and D() denotes the operator that form a diagonal matrix from an input vector. Note also that the elementwise division of the two fields, x z, yields a hGMRF with precision matrix Q u . To complete the FoGSM model, we need to specify the structure of the multiplier field z. For tractability, we use another hGMRF as a substrate, and map it into positive values by exponentiation,