Part of Advances in Neural Information Processing Systems 11 (NIPS 1998)
Balázs Kégl, Adam Krzyzak, Tamás Linder, Kenneth Zeger
Principal curves have been defined as "self consistent" smooth curves which pass through the "middle" of a d-dimensional probability distri(cid:173) bution or data cloud. Recently, we  have offered a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition made it possible to carry out a theoretical analysis of learning principal curves from training data. In this paper we propose a practical construction based on the new definition. Simulation results demonstrate that the new algorithm compares favorably with previous methods both in terms of performance and computational complexity.