Part of Advances in Neural Information Processing Systems 21 (NIPS 2008)
Shenghuo Zhu, Kai Yu, Yihong Gong
Stochastic relational models provide a rich family of choices for learning and predicting dyadic data between two sets of entities. They generalize matrix factorization to a supervised learning problem that utilizes attributes of objects in a hierarchical Bayesian framework. Previously, empirical Bayesian inference was applied, but it does not scale when the size of either object set reaches tens of thousands. In this paper, we introduce a Markov chain Monte Carlo (MCMC) algorithm to scale the model to very large dyadic data sets. Both superior scalability and predictive accuracy are demonstrated on a collaborative filtering problem that involves tens of thousands of users and half a million items.