Bayesian Agglomerative Clustering with Coalescents

Part of Advances in Neural Information Processing Systems 20 (NIPS 2007)

Bibtex Metadata Paper

Authors

Yee Teh, Hal Daume III, Daniel M. Roy

Abstract

We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman’s coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over the state-of-the-art, and demonstrate our approach in document clustering and phylolinguistics.