Improved Hidden Markov Model Speech Recognition Using Radial Basis Function Networks

Part of Advances in Neural Information Processing Systems 4 (NIPS 1991)



Elliot Singer, Richard P. Lippmann


A high performance speaker-independent isolated-word hybrid speech recognizer was developed which combines Hidden Markov Models (HMMs) and Radial Basis Function (RBF) neural networks. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer upon which the hybrid system was based. These results and additional experiments demonstrate that RBF networks can be successfully incorporated in hybrid recognizers and suggest that they may be capable of good performance with fewer parameters than required by Gaussian mixture classifiers. A global parameter optimization method designed to minimize the overall word error rather than the frame recognition error failed to reduce the error rate.


A hybrid isolated-word speech recognizer was developed which combines neural network and Hidden Markov Model (HMM) approaches. The hybrid approach is an attempt to capitalize on the superior static pattern classification performance of neural network classifiers [6] while preserving the temporal alignment properties of HMM Viterbi decoding. Our approach is unique when compared to other studies [2, 5] in that we use Radial Basis Function (RBF) rather than multilayer sigmoidal networks. RBF networks were chosen because their static pattern classification performance is comparable to that of other networks and they can be trained rapidly using a one-pass matrix inversion technique [8].
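The rapid training property mentioned above can be illustrated with a minimal sketch. It is not the procedure of reference [8]; it assumes Gaussian basis functions with fixed, hand-picked centers and a single shared width, one-hot class targets, and a least-squares output layer obtained by solving the normal equations once with a small ridge term for numerical stability. All function names, the toy data, and the parameter values are hypothetical.

```python
import math

def rbf_features(x, centers, width):
    """Gaussian basis activations of one input vector (assumed basis shape)."""
    feats = []
    for c in centers:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        feats.append(math.exp(-d2 / (2.0 * width ** 2)))
    return feats

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def train_output_weights(X, labels, centers, width, n_classes, ridge=1e-6):
    """One-pass least-squares fit of the output layer (no gradient descent)."""
    Phi = [rbf_features(x, centers, width) for x in X]
    m = len(Phi[0])
    # Normal equations: (Phi^T Phi + ridge * I) W = Phi^T T, with one-hot targets T.
    A = [[sum(row[i] * row[j] for row in Phi) + (ridge if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    # Column k of Phi^T T sums the activations of the samples labeled k.
    B = [[sum(row[i] for row, y in zip(Phi, labels) if y == k)
          for k in range(n_classes)] for i in range(m)]
    return solve(A, B)

def classify(x, centers, width, W):
    """Return the index of the class with the largest network output."""
    feats = rbf_features(x, centers, width)
    n_classes = len(W[0])
    scores = [sum(f * W[i][k] for i, f in enumerate(feats))
              for k in range(n_classes)]
    return max(range(n_classes), key=lambda k: scores[k])

if __name__ == "__main__":
    # Toy two-class data, purely illustrative.
    X = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.2), (3.0, 3.0), (3.1, 2.8), (2.9, 3.2)]
    labels = [0, 0, 0, 1, 1, 1]
    centers = [(0.0, 0.0), (3.0, 3.0)]
    W = train_output_weights(X, labels, centers, width=1.0, n_classes=2)
    print([classify(x, centers, 1.0, W) for x in X])
```

Because the output weights come from a single linear solve, training cost is dominated by forming and solving one small system, which is the sense in which such networks can be trained in one pass. In a hybrid recognizer the per-frame outputs, rather than the hard class decision used in this toy example, would be passed on to Viterbi decoding.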

The hybrid HMM/RBF isolated-word recognizer is shown in Figure 1. For each

