Large Vocabulary Continuous Speech Recognition Performance in Car Environments for Various Phoneme Models

Authors: Miichi Yamada, Katsumi Nishitani, Satoshi Nakamura, Kiyohiro Shikano
Language: English
Publication year: 2000
Subject:
Description: This paper evaluates continuous speech recognition performance in car environments, comparing several kinds of phoneme models. Because recognition performance degrades considerably in noisy conditions, this degradation must be addressed for in-car use. Two primary factors cause it: additive noise, such as background noise, and multiplicative distortion, such as reverberation in the car cabin, which is emphasized by the distance between the speaker and the microphone. In this paper, phoneme models that take additive noise and multiplicative distortion into account are trained on simulated speech data. With the car engine off, the best word recognition rate is 98.8%, obtained with the multiplicative distortion phoneme model trained on speech data generated by the multiplicative distortion simulation. With the car running, the best recognition rate is 97.2%, obtained with the phoneme model that accounts for both multiplicative distortion and additive noise. These results show the effectiveness of phoneme models trained on a simulated speech database for car environments.
WESTPRAC VII 2000: the 7th West Pacific Regional Acoustics Conference, October 3-5, 2000, Kumamoto, Japan.
Database: OpenAIRE
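
The description mentions training data simulated with multiplicative distortion and additive noise, but the record does not give the procedure. The following is a minimal sketch of one common way such data can be generated, not the authors' method: multiplicative distortion in the spectral domain corresponds to convolution with an impulse response in the time domain, and noise is then added at a chosen signal-to-noise ratio. The function names, the `snr_db` parameter, and the toy signals are all assumptions made for illustration.

```python
import math
import random


def convolve(signal, impulse_response):
    """Full linear convolution of two sequences (time-domain filtering)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def simulate_car_speech(clean, impulse_response, noise, snr_db):
    """Apply channel distortion, then add noise scaled to a target SNR.

    All parameters are hypothetical: `impulse_response` stands in for a
    measured cabin response, `noise` for recorded running-car noise, and
    `snr_db` for a chosen signal-to-noise ratio in decibels.
    """
    # Multiplicative distortion: multiplication in the spectral domain is
    # convolution in the time domain. Truncate to the clean signal length.
    distorted = convolve(clean, impulse_response)[: len(clean)]
    if len(noise) < len(distorted):
        raise ValueError("noise must be at least as long as the speech")
    noise = noise[: len(distorted)]
    # Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.
    gain = math.sqrt(power(distorted) / (power(noise) * 10 ** (snr_db / 10)))
    return [d + gain * n for d, n in zip(distorted, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [math.sin(2 * math.pi * 0.01 * t) for t in range(1000)]
    h = [1.0, 0.0, 0.5, 0.0, 0.25]  # toy impulse response with two echoes
    noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    noisy = simulate_car_speech(clean, h, noise, snr_db=10.0)
    print(len(noisy))
```

Under these assumptions, passing only an impulse response and a very high `snr_db` approximates the engine-off condition (distortion only), while a lower `snr_db` with recorded driving noise approximates the running condition.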