On Semi-Supervised LF-MMI Training of Acoustic Models with Limited Data
Author: | Irina Illina, Imran Sheikh, Emmanuel Vincent |
---|---|
Contributors: | Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Université de Lorraine (UL); Centre National de la Recherche Scientifique (CNRS); COMPRISE; Grid'5000; European Project: 825081, H2020, COMPRISE (2018) |
Year of publication: | 2020 |
Subject: | semi-supervised training; Computer science; business.industry; Detector; speech recognition; 020206 networking & telecommunications; Pattern recognition; 02 engineering and technology; Mutual information; 01 natural sciences; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; lattice-free MMI; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; Transcription (linguistics); error detection; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Artificial intelligence; Error detection and correction; business; 010301 acoustics |
Source: | INTERSPEECH 2020, Oct 2020, Shanghai, China |
DOI: | 10.21437/interspeech.2020-2242 |
Description: | International audience; This work investigates semi-supervised training of acoustic models (AM) with the lattice-free maximum mutual information (LF-MMI) objective in practically relevant scenarios with a limited amount of labeled in-domain data. An error-detection-driven semi-supervised AM training approach is proposed, in which an error detector controls the hypothesized transcriptions or lattices used as LF-MMI training targets on additional unlabeled data. Under this approach, our first method uses a single error-tagged hypothesis, whereas our second method uses a modified supervision lattice. These methods are evaluated and compared with existing semi-supervised AM training methods in three different matched or mismatched limited-data setups. Word error recovery rates of 28% to 89% are reported. |
Database: | OpenAIRE |
External link: |