A speaker adaptation method for non-native speech using learners’ native utterances for computer-assisted language learning systems
Author: | Yuichi Ohkawa, Shozo Makino, Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito |
---|---|
Year of publication: | 2009 |
Subject: |
Linguistics and Language
Computer science; Communication; First language; Speech recognition; Acoustic model; Pronunciation; Speech processing; Speaker recognition; Language acquisition; Language and Linguistics; Computer Science Applications; Speaker diarisation; Modeling and Simulation; Computer Vision and Pattern Recognition; Adaptation (computer science); Software |
Source: | Speech Communication. 51:875-882 |
ISSN: | 0167-6393 |
Description: | In recent years, various computer-assisted language learning (CALL) systems that evaluate a learner's pronunciation using speech recognition technology have been proposed. Speaker adaptation is a promising way to evaluate a learner's utterances and point out problems with higher accuracy. However, many learners who use a CALL system have very poor speaking ability in the target language (L2), and conventional speaker adaptation methods require correctly pronounced L2 utterances from the learner for adaptation. In this paper, we propose two new speaker adaptation methods for CALL systems. The new methods require only the learner's utterances in their native language (L1) to adapt the acoustic model for L2. The first method adapts acoustic models using a bilingual speaker's utterances: the speaker-independent acoustic models of L1 and L2 are first adapted to the bilingual speaker, and are then adapted to the learner using the learner's L1 utterances. This method gave phoneme recognition accuracy about 5 points higher than the baseline method. The second method is a training algorithm for a set of acoustic models based on speaker adaptive training. It can robustly train bilingual speakers' models from a few L1 and L2 utterances by bilingual speakers. This method gave phoneme recognition accuracy about 10 points higher than the baseline method. |
Database: | OpenAIRE |
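
The two-stage flow of the first method in the abstract (adapt the speaker-independent L1 and L2 models to a bilingual speaker, then adapt again using only the learner's L1 utterances) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not say which adaptation transform is used, so a single global mean shift stands in for it, each "acoustic model" is reduced to one mean vector per phoneme, and the names `estimate_shift`, `apply_shift`, and `adapt_l2_from_l1` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Model = Dict[str, Vector]            # phoneme label -> mean feature vector
Frames = List[Tuple[str, Vector]]    # (phoneme label, observed feature vector)


def estimate_shift(model: Model, frames: Frames) -> Vector:
    """Average offset between observed frames and the model means of their labels.

    Assumption: a global mean shift is a stand-in for whatever adaptation
    transform the paper actually uses.
    """
    if not frames:
        raise ValueError("at least one labelled frame is required")
    dim = len(next(iter(model.values())))
    total = [0.0] * dim
    for label, obs in frames:
        mean = model[label]
        for i in range(dim):
            total[i] += obs[i] - mean[i]
    return [t / len(frames) for t in total]


def apply_shift(model: Model, shift: Vector) -> Model:
    """Return a copy of the model with every mean moved by the same shift."""
    return {label: [m + s for m, s in zip(mean, shift)]
            for label, mean in model.items()}


def adapt_l2_from_l1(si_l1: Model, si_l2: Model,
                     bilingual_l1: Frames, bilingual_l2: Frames,
                     learner_l1: Frames) -> Model:
    """Two-stage sketch: SI models -> bilingual speaker -> learner (L1 data only)."""
    # Stage 1: adapt both speaker-independent models to the bilingual speaker.
    bi_l1 = apply_shift(si_l1, estimate_shift(si_l1, bilingual_l1))
    bi_l2 = apply_shift(si_l2, estimate_shift(si_l2, bilingual_l2))
    # Stage 2: estimate the learner's offset from L1 utterances only,
    # then carry that same offset over to the bilingual-adapted L2 model.
    learner_shift = estimate_shift(bi_l1, learner_l1)
    return apply_shift(bi_l2, learner_shift)
```

The point of the sketch is only the data flow: no L2 utterance from the learner is ever consumed, which is the property the abstract emphasizes. A realistic system would replace the mean shift with a proper transform over full HMM parameters.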