Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)
Christopher Williams, Felix Agakov, Stephen Felderhof
Recently Hinton (1999) has introduced the Products of Experts (PoE) model in which several individual probabilistic models for data are combined to provide an overall model of the data. Below we consider PoE models in which each expert is a Gaussian. Although the product of Gaussians is also a Gaussian, if each Gaussian has a simple structure the product can have a richer structure. We examine (1) products of Gaussian pancakes, which give rise to probabilistic Minor Components Analysis, (2) products of 1-factor PPCA models and (3) a products of experts construction for an AR(1) process.
Recently Hinton (1999) has introduced the Products of Experts (PoE) model in which several individual probabilistic models for data are combined to provide an overall model of the data. In this paper we consider PoE models in which each expert is a Gaussian. It is easy to see that in this case the product model will also be Gaussian. However, if each Gaussian has a simple structure, the product can have a richer structure. Using Gaussian experts is attractive as it permits a thorough analysis of the product architecture, which can be difficult with other models, e.g. models defined over discrete random variables.
Below we examine three cases of the products of Gaussians construction: (1) products of Gaussian pancakes (PoGP), which give rise to probabilistic Minor Components Analysis (MCA), providing a complementary result to probabilistic Principal Components Analysis (PPCA) obtained by Tipping and Bishop (1999); (2) products of 1-factor PPCA models; (3) a products of experts construction for an AR(1) process.
Products of Gaussians
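If each expert is a Gaussian $p_i(x|\theta_i) \sim N(\mu_i, C_i)$, the resulting distribution of the product of $m$ Gaussians may be expressed as

$$p(x|\theta) = \frac{\prod_{i=1}^{m} p_i(x|\theta_i)}{\int \prod_{i=1}^{m} p_i(x'|\theta_i)\, dx'}.$$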
By completing the square in the exponent it may be easily shown that $p(x|\theta) \sim N(\mu_\Sigma, C_\Sigma)$, where $C_\Sigma^{-1} = \sum_{i=1}^{m} C_i^{-1}$. To simplify the following derivations we will assume that $p_i(x|\theta_i) \sim N(0, C_i)$ and thus that $p(x|\theta) \sim N(0, C_\Sigma)$. $\mu_\Sigma \neq 0$ can be obtained by translation of the coordinate system.
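The precision-sum result can be checked numerically. The following is a minimal sketch, not code from the paper: the helper names `product_of_gaussians` and `log_gauss`, the random test covariances, and the use of NumPy are all assumptions made for illustration. It also uses the standard fact that the product mean is the precision-weighted combination of the expert means, $\mu_\Sigma = C_\Sigma \sum_i C_i^{-1} \mu_i$. If the product of the expert densities is proportional to $N(\mu_\Sigma, C_\Sigma)$, the difference of the two log-densities should be the same constant at every point $x$.

```python
import numpy as np

def product_of_gaussians(means, covs):
    """Mean and covariance of the normalised product of Gaussian experts."""
    precisions = [np.linalg.inv(C) for C in covs]
    prec_sum = sum(precisions)            # C_Sigma^{-1} = sum_i C_i^{-1}
    cov = np.linalg.inv(prec_sum)
    # Product mean: precision-weighted combination of the expert means.
    mean = cov @ sum(P @ mu for P, mu in zip(precisions, means))
    return mean, cov

def log_gauss(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at x."""
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)

# Hypothetical experts: m random means and symmetric positive definite covariances.
rng = np.random.default_rng(0)
d, m = 3, 4
means = [rng.normal(size=d) for _ in range(m)]
covs = []
for _ in range(m):
    A = rng.normal(size=(d, d))
    covs.append(A @ A.T + np.eye(d))

mean, cov = product_of_gaussians(means, covs)

# Log of the unnormalised product minus log N(mean, cov), at several test points.
xs = rng.normal(size=(5, d))
gaps = [
    sum(log_gauss(x, mu, C) for mu, C in zip(means, covs)) - log_gauss(x, mean, cov)
    for x in xs
]
print(np.allclose(gaps, gaps[0]))
```

The constant value in `gaps` is the log of the normalising integral in the PoE definition; setting all entries of `means` to zero gives the zero-mean case assumed in the rest of the derivations.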
1 Products of Gaussian Pancakes
A Gaussian "pancake" (GP) is a d-dimensional Gaussian, contracted in one dimension and elongated in the other d - 1 dimensions. In this section we show that the maximum likelihood solution for a product of Gaussian pancakes (PoGP) yields a probabilistic formulation of Minor Components Analysis (MCA).
1.1 Covariance Structure of a GP Expert
Consider a d-dimensional Gaussian whose probability contours are contracted in the direction $w$ and equally elongated in mutually orthogonal directions $v_1, \ldots, v_{d-1}$. We call this a Gaussian pancake or GP. Its inverse covariance may be written as