Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)
Jian Peng, Liefeng Bo, Jinbo Xu
Conditional random fields (CRF) are quite successful on sequence labeling tasks such as natural language processing and biological sequence analysis. CRF models use linear potential functions to represent the relationship between input features and outputs. However, in many real-world applications such as protein structure prediction and handwriting recognition, the relationship between input features and outputs is highly complex and nonlinear, which cannot be accurately modeled by a linear function. To model the nonlinear relationship between input features and outputs we propose Conditional Neural Fields (CNF), a new conditional probabilistic graphical model for sequence labeling. Our CNF model extends CRF by adding one (or possibly several) middle layer between input features and outputs. The middle layer consists of a number of hidden parameterized gates, each acting as a local neural network node or feature extractor to capture the nonlinear relationship between input features and outputs. Therefore, conceptually this CNF model is much more expressive than the linear CRF model. To better control the complexity of the CNF model, we also present a hyperparameter optimization procedure within the evidence framework. Experiments on two widely-used benchmarks indicate that this CNF model performs significantly better than a number of popular methods. In particular, our CNF model is the best among about ten machine learning methods for protein secondary tructure prediction and also among a few of the best methods for handwriting recognition.