Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)
Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda
In this paper, we propose a Bayesian framework, variational Bayesian estimation and clustering for speech recognition (VBEC), which constructs shared-state triphone HMMs using a variational Bayesian approach and recognizes speech using Bayesian prediction classification. An appropriate model structure with high recognition performance can be found within the VBEC framework. Unlike conventional methods, including the BIC or MDL criterion based on the maximum likelihood approach, the proposed model selection is valid in principle even when the amount of data is insufficient, because it does not rely on an asymptotic assumption. In isolated word recognition experiments, we show the advantage of VBEC over conventional methods, especially when dealing with small amounts of data.
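
The contrast between the two kinds of model selection criterion can be sketched as follows. This is a generic textbook formulation, not notation taken from the paper: the symbols O (training data), Z (latent state sequences), \theta (model parameters), m (candidate model structure), k_m (number of free parameters), and N (number of samples) are all assumed here for illustration.

```latex
% BIC/MDL-type criterion: a large-sample (asymptotic) approximation
% to the log evidence, built around the maximum likelihood estimate.
\log p(O \mid m) \;\approx\; \log p(O \mid \hat{\theta}_m, m) \;-\; \frac{k_m}{2} \log N
\qquad (N \to \infty)

% Variational Bayesian criterion: a lower bound on the log evidence
% that holds for any amount of data, given a factorized posterior q(Z) q(\theta).
\log p(O \mid m) \;\ge\; \mathcal{F}_m[q]
  \;=\; \mathbb{E}_{q(Z)\,q(\theta)}\!\left[
      \log \frac{p(O, Z \mid \theta, m)\, p(\theta \mid m)}{q(Z)\, q(\theta)}
  \right]

% Model selection then picks the structure with the largest bound.
\hat{m} \;=\; \arg\max_{m} \; \max_{q} \; \mathcal{F}_m[q]
```

The first expression is only justified as N grows large, whereas the bound in the second holds at any sample size, which is one way to read the abstract's claim about small amounts of data.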