Christopher Bishop, Markus Svensén, Christopher Williams
The Self-Organizing Map (SOM) algorithm has been extensively studied and has been applied with considerable success to a wide variety of problems. However, the algorithm is derived from heuristic ideas and this leads to a number of significant limitations. In this paper, we consider the problem of modelling the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. We introduce a novel form of latent variable model, which we call the GTM algorithm (for Generative Topographic Mapping), which allows general non-linear transformations from latent space to data space, and which is trained using the EM (expectation-maximization) algorithm. Our approach overcomes the limitations of the SOM, while introducing no significant disadvantages. We demonstrate the performance of the GTM algorithm on simulated data from flow diagnostics for a multi-phase oil pipeline.
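As a rough illustration of the kind of model summarized above, the following Python sketch fits a constrained mixture of Gaussians whose centres are the images of a fixed latent grid under a non-linear map, trained with EM. The abstract does not specify the form of the mapping or of the latent distribution, so everything concrete here is an assumption made for illustration: the one-dimensional latent space, the regular grid, the Gaussian radial basis functions, the small ridge term `reg`, and all names (`gtm_fit`, `grid_size`, `n_rbf`). The toy data is synthetic and unrelated to the oil pipeline data.

```python
import numpy as np

def gtm_fit(T, grid_size=10, n_rbf=4, n_iter=30, reg=1e-3, seed=0):
    """Illustrative GTM-style EM fit of data T (N x D); 1-D latent space assumed."""
    rng = np.random.default_rng(seed)
    N, D = T.shape
    K = grid_size
    # Assumed latent grid points and RBF centres on [-1, 1].
    X = np.linspace(-1.0, 1.0, K)[:, None]               # K x 1
    C = np.linspace(-1.0, 1.0, n_rbf)[:, None]           # n_rbf x 1
    sigma = 2.0 / (n_rbf - 1)                            # assumed basis width
    Phi = np.exp(-((X - C.T) ** 2) / (2.0 * sigma ** 2)) # K x n_rbf
    Phi = np.hstack([Phi, np.ones((K, 1))])              # append a bias column
    M = Phi.shape[1]
    W = rng.normal(scale=0.1, size=(M, D))               # mapping weights
    W[-1] = T.mean(axis=0)                               # bias starts at the data mean
    beta = 1.0 / T.var()                                 # inverse noise variance
    log_liks = []
    for _ in range(n_iter):
        Y = Phi @ W                                      # mixture centres, K x D
        d2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(axis=-1)  # K x N
        # E-step: posterior responsibility of each grid point for each data point.
        log_p = -0.5 * beta * d2 + 0.5 * D * np.log(beta / (2.0 * np.pi))
        m = log_p.max(axis=0)
        log_sum = m + np.log(np.exp(log_p - m).sum(axis=0))       # length N
        log_liks.append(float((log_sum - np.log(K)).sum()))
        R = np.exp(log_p - log_sum)                      # K x N, columns sum to 1
        # M-step for W: weighted, ridge-regularized least squares.
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + reg * np.eye(M)
        W = np.linalg.solve(A, Phi.T @ R @ T)
        # M-step for beta, using the updated centres and the current responsibilities.
        Y = Phi @ W
        d2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(axis=-1)
        beta = N * D / (R * d2).sum()
    return W, beta, R, log_liks

# Toy usage: noisy points along a curve in 2-D (synthetic data).
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=200)
T = np.column_stack([x, np.sin(2.0 * x)]) + 0.05 * rng.normal(size=(200, 2))
W, beta, R, log_liks = gtm_fit(T)
```

Each column of `R` gives the posterior over latent grid points for one data point, which is what would be used to place that point in latent space for visualization. The ridge term `reg` is included only to keep the linear solve well conditioned in this sketch.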