Part of Advances in Neural Information Processing Systems 5 (NIPS 1992)
Pierre Baldi, Yves Chauvin, Tim Hunkapiller, Marcella McClure
Hidden Markov Models (HMMs) can be applied to several important problems in molecular biology. We introduce a new convergent learning algorithm for HMMs that, unlike the classical Baum-Welch algorithm, is smooth and can be applied on-line or in batch mode, with or without the usual Viterbi most likely path approximation. Left-right HMMs with insertion and deletion states are then trained to represent several protein families including immunoglobulins and kinases. In all cases, the models derived capture all the important statistical properties of the families and can be used efficiently in a number of important tasks such as multiple alignment, motif detection, and classification.
* and Division of Biology, California Institute of Technology. † and Department of Psychology, Stanford University.