Part of Advances in Neural Information Processing Systems 16 (NIPS 2003)
William Campbell, Joseph Campbell, Douglas Reynolds, Douglas Jones, Timothy Leek
A recent area of significant progress in speaker recognition is the use of high-level features such as idiolect, phonetic relations, prosody, and discourse structure. A speaker not only has a distinctive acoustic sound but also uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with long-term statistics of an individual's phone patterns, word patterns, etc. We propose the use of support vector machines and term frequency analysis of phone sequences to model a given speaker. To this end, we explore techniques for text categorization applied to the problem. We derive a new kernel based upon a linearization of likelihood ratio scoring. We introduce a new phone-based SVM speaker recognition approach that halves the error rate of conventional phone-based approaches.
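As a rough illustration of the term-frequency idea described in the abstract, the following Python sketch computes relative frequencies of phone n-grams for one conversation side and compares two sides with a weighted inner product. The function names, the bigram default, and the background-frequency weighting are assumptions made for illustration only; the abstract does not give the form of the kernel, so this should not be read as the authors' method.

```python
from collections import Counter


def ngram_frequencies(phones, n=2):
    """Relative frequency of each phone n-gram in one phone sequence.

    `phones` is a list of phone labels (hypothetical input, e.g. the
    output of a phone recognizer).
    """
    grams = [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def weighted_kernel(freq_a, freq_b, background):
    """Weighted inner product of two n-gram frequency vectors.

    Assumption: each vector entry is scaled by 1/sqrt(background frequency),
    so the product of the two scaled entries divides by the background
    frequency once. N-grams missing from `background` are skipped.
    """
    shared = set(freq_a) & set(freq_b)
    return sum(
        freq_a[g] * freq_b[g] / background[g]
        for g in shared
        if background.get(g, 0) > 0
    )


# Toy usage with made-up phone labels.
side_a = ngram_frequencies(["ah", "b", "ah", "b"])
side_b = ngram_frequencies(["b", "ah", "b", "ah"])
background = {("ah", "b"): 0.5, ("b", "ah"): 0.5}
similarity = weighted_kernel(side_a, side_b, background)
```

A kernel matrix built this way could in principle be passed to any SVM trainer that accepts precomputed kernels; the choice of n, the background estimate, and the weighting would all need to follow the paper's actual derivation.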