Trevor Hastie, Patrice Simard
Simard, LeCun & Denker (1993) showed that the performance of nearest-neighbor classification schemes for handwritten character recognition can be improved by incorporating invariance to spe(cid:173) the so cific transformations in the underlying distance metric - called tangent distance. The resulting classifier, however, can be prohibitively slow and memory intensive due to the large amount of prototypes that need to be stored and used in the distance compar(cid:173) isons. In this paper we develop rich models for representing large subsets of the prototypes. These models are either used singly per class, or as basic building blocks in conjunction with the K-means clustering algorithm.
*This work was performed while Trevor Hastie was a member of the Statistics and Data
Analysis Research Group, AT&T Bell Laboratories, Murray Hill, NJ 07974.