Brendan J. Frey, Delbert Dueck
Clustering is a fundamental problem in machine learning and has been approached in many ways. Two general and quite different approaches include iteratively ﬁtting a mixture model (e.g., using EM) and linking to- gether pairs of training cases that have high afﬁnity (e.g., using spectral methods). Pair-wise clustering algorithms need not compute sufﬁcient statistics and avoid poor solutions by directly placing similar examples in the same cluster. However, many applications require that each cluster of data be accurately described by a prototype or model, so afﬁnity-based clustering – and its beneﬁts – cannot be directly realized. We describe a technique called “afﬁnity propagation”, which combines the advantages of both approaches. The method learns a mixture model of the data by recursively propagating afﬁnity messages. We demonstrate afﬁnity prop- agation on the problems of clustering image patches for image segmen- tation and learning mixtures of gene expression models from microar- ray data. We ﬁnd that afﬁnity propagation obtains better solutions than mixtures of Gaussians, the K-medoids algorithm, spectral clustering and hierarchical clustering, and is both able to ﬁnd a pre-speciﬁed number of clusters and is able to automatically determine the number of clusters. Interestingly, afﬁnity propagation can be viewed as belief propagation in a graphical model that accounts for pairwise training case likelihood functions and the identiﬁcation of cluster centers.