Part of Advances in Neural Information Processing Systems 19 (NIPS 2006)
Mark Johnson, Thomas Griffiths, Sharon Goldwater
This paper introduces adaptor grammars, a class of probabilistic models of lan- guage that generalize probabilistic context-free grammars (PCFGs). Adaptor grammars augment the probabilistic rules of PCFGs with “adaptors” that can in- duce dependencies among successive uses. With a particular choice of adaptor, based on the Pitman-Yor process, nonparametric Bayesian models of language using Dirichlet processes and hierarchical Dirichlet processes can be written as simple grammars. We present a general-purpose inference algorithm for adaptor grammars, making it easy to define and use such models, and illustrate how several existing nonparametric Bayesian models can be expressed within this framework.