Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)
Francois Caron, Arnaud Doucet
In recent years, Dirichlet processes and the associated Chinese restaurant process (CRP) have found many applications in clustering, while the Indian buffet process (IBP) is increasingly used to describe latent feature models. In the clustering case, we associate with each data point a latent allocation variable. These latent variables can share the same value, and this induces a partition of the data set. The CRP is a prior distribution on such partitions. In latent feature models, we associate with each data point a potentially infinite number of binary latent variables indicating the possession of some features, and the IBP is a prior distribution on the associated infinite binary matrix. These prior distributions are attractive because they ensure exchangeability over samples. We propose here extensions of these models to decomposable graphs. These models have appealing properties and can be easily learned using Monte Carlo techniques.