Part of Advances in Neural Information Processing Systems 9 (NIPS 1996)
David Lowe, Michael Tipping
Dimension-reducing feature extraction neural network techniques which also preserve neighbourhood relationships in data have tra(cid:173) ditionally been the exclusive domain of Kohonen self organising maps. Recently, we introduced a novel dimension-reducing feature extraction process, which is also topographic, based upon a Radial Basis Function architecture. It has been observed that the gener(cid:173) alisation performance of the system is broadly insensitive to model order complexity and other smoothing factors such as the kernel widths, contrary to intuition derived from supervised neural net(cid:173) work models. In this paper we provide an effective demonstration of this property and give a theoretical justification for the apparent 'self-regularising' behaviour of the 'NEUROSCALE' architecture.
'NeuroScale': A Feed-forward Neural Network Topographic Transformation
Recently an important class of topographic neural network based feature extraction approaches, which can be related to the traditional statistical methods of Sammon Mappings (Sammon, 1969) and Multidimensional Scaling (Kruskal, 1964), have been introduced (Mao and Jain, 1995; Lowe, 1993; Webb, 1995; Lowe and Tipping, 1996). These novel alternatives to Kohonen-like approaches for topographic feature extraction possess several interesting properties. For instance, the NEuROSCALE architecture has the empirically observed property that the generalisation perfor-
D. Lowe and M. E. Tipping
mance does not seem to depend critically on model order complexity, contrary to intuition based upon knowledge of its supervised counterparts. This paper presents evidence for their 'self-regularising' behaviour and provides an explanation in terms of the curvature of the trained models.
We now provide a brief introduction to the NEUROSCALE philosophy of nonlinear topographic feature extraction. Further details may be found in (Lowe, 1993; Lowe and Tipping, 1996). We seek a dimension-reducing, topographic transformation of data for the purposes of visualisation and analysis. By 'topographic', we imply that the geometric structure of the data be optimally preserved in the transformation, and the embodiment of this constraint is that the inter-point distances in the feature space should correspond as closely as possible to those distances in the data space. The implementation of this principle by a neural network is very simple. A Radial Basis Function (RBF) neural network is utilised to predict the coordinates of the data point in the transformed feature space. The locations of the feature points are indirectly determined by adjusting the weights of the network. The transformation is determined by optimising the network parameters in order to minimise a suitable error measure that embodies the topographic principle.
The specific details of this alternative approach are as follows. Given an m(cid:173) dimensional input space of N data points x q , an n-dimensional feature space of points Yq is generated such that the relative positions of the feature space points minimise the error, or 'STRESS', term:
E = 2: 2:(d~p - dqp)2,