Model-Based Channel Compensation for CSR

Degradation of speech recognition performance under changing recording/transmission environments has become an increasingly important problem as recognition systems move from the lab to real users. Noise/channel compensation can be particularly difficult in applications where utterances are short and simple cepstral mean estimates are unreliable. To address this problem, our research explores new channel/noise compensation techniques for improved telephone speech recognition.

One project extended previous work in maximum likelihood channel estimation by introducing a prior distribution model of the channel/noise, applying Bayesian estimation techniques, and assessing methods that include training data compensation. Both cepstral filtering and model adaptation techniques are explored to assess computation-performance tradeoffs. A progressive recognition search allows phone-conditioned channel estimation, which improves compensation for short utterances (common in many applications). Error rate reductions of up to 10% are observed. The approach can easily be implemented in any multi-pass speech recognition system.
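To illustrate why a channel prior helps with short utterances, here is a minimal sketch of Bayesian (MAP) estimation of an additive cepstral channel bias. It is not the project's actual algorithm. It assumes a single diagonal Gaussian for clean speech cepstra in place of phone-conditioned models, and a diagonal Gaussian prior on the channel. The function names (map_channel_estimate, compensate) and all parameters are hypothetical.

```python
def map_channel_estimate(frames, clean_mean, clean_var, prior_mean, prior_var):
    """MAP estimate of an additive cepstral channel bias, per dimension.

    Assumed model (a simplification, not the project's method):
        observed_t = clean_t + channel
        clean_t  ~ N(clean_mean, clean_var)   (diagonal Gaussian)
        channel  ~ N(prior_mean, prior_var)   (diagonal Gaussian prior)
    """
    num_frames = len(frames)
    if num_frames == 0:
        # No data: the best available estimate is the prior mean.
        return list(prior_mean)

    estimate = []
    for d in range(len(clean_mean)):
        # ML estimate: observed cepstral mean minus the expected clean mean.
        ml = sum(frame[d] for frame in frames) / num_frames - clean_mean[d]
        # Precision (inverse variance) contributed by the data and the prior.
        data_precision = num_frames / clean_var[d]
        prior_precision = 1.0 / prior_var[d]
        # Posterior mean: precision-weighted average of ML estimate and prior.
        estimate.append(
            (data_precision * ml + prior_precision * prior_mean[d])
            / (data_precision + prior_precision)
        )
    return estimate


def compensate(frames, channel):
    """Cepstral filtering: subtract the estimated channel from every frame."""
    return [[c - h for c, h in zip(frame, channel)] for frame in frames]
```

With only a few frames, the estimate stays close to the prior mean. As the number of frames grows, the data term dominates and the result approaches the maximum likelihood estimate. In a multi-pass system, the single clean mean used here could be replaced by per-frame means taken from a first-pass phone alignment.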

A second project involves development of speaker separation techniques explicitly aimed at improving automatic speech recognition, with a secondary goal of using recognition to refine the separation algorithm for better perceptual quality.

(March 1994 - July 2000)

SPONSORS: National Science Foundation and ARPA, NSF IRI-9408896


``Reducing the Effects of Linear Channel Distortion on Continuous Speech Recognition,'' R. Bates and M. Ostendorf, IEEE Transactions on Speech and Audio Processing, to appear.