Online Batch Normalization Adaptation for Automatic Speech Recognition

Authors: Roberto Gemello, Franco Mana, Felix Weninger, Puming Zhan
Publication year: 2019
Subject:
Source: ASRU
DOI: 10.1109/asru46091.2019.9003883
Description: Deep Neural Network (DNN) acoustic models are sensitive to mismatch between training and testing environments. When a trained model is tested on unseen speakers, domains, or environments, recognition accuracy can degrade substantially. In such cases, offline adaptation with a fair amount of field data can improve recognition accuracy significantly, and is commonly applied to ASR systems in practice. Ideally, this kind of adaptation should also be done online in order to track any unexpected dynamic changes in the environment during inference. However, online adaptation is subject to strict constraints on computational cost. Moreover, the small amount of available data and the unsupervised nature of the adaptation make online adaptation a very challenging task, especially for DNN acoustic models, which normally contain millions of parameters. In this paper, we introduce a simple and effective online adaptation technique to compensate for the mismatch between training and testing conditions for DNN acoustic models. It works by adapting, online, the parameters associated with the batch normalization applied during model training. Our results show that this technique can improve accuracy significantly in a domain-mismatched scenario for different DNN architectures.
Database: OpenAIRE
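
The description above does not say which batch normalization parameters are adapted or how they are updated. As a rough illustration only, the sketch below assumes the stored mean and variance are re-estimated from unlabeled incoming frames with an exponential moving average, which is one common unsupervised option and not necessarily the authors' method. The class name `OnlineBatchNorm`, the `momentum` value, and the toy data are all hypothetical.

```python
import math


class OnlineBatchNorm:
    """Batch-norm layer whose mean/variance are re-estimated at inference time.

    Assumption: only the normalization statistics are adapted; the learned
    scale (gamma) and shift (beta) stay fixed.
    """

    def __init__(self, mean, var, gamma, beta, momentum=0.1, eps=1e-5):
        self.mean = list(mean)
        self.var = list(var)
        self.gamma = list(gamma)
        self.beta = list(beta)
        self.momentum = momentum  # hypothetical value, not from the paper
        self.eps = eps

    def adapt(self, frames):
        """Blend the statistics of a chunk of unlabeled frames into the stored ones."""
        n = len(frames)
        for d in range(len(self.mean)):
            column = [frame[d] for frame in frames]
            chunk_mean = sum(column) / n
            chunk_var = sum((x - chunk_mean) ** 2 for x in column) / n
            m = self.momentum
            self.mean[d] = (1 - m) * self.mean[d] + m * chunk_mean
            self.var[d] = (1 - m) * self.var[d] + m * chunk_var

    def normalize(self, frame):
        """Apply batch normalization to one frame using the current statistics."""
        return [
            self.gamma[d] * (x - self.mean[d]) / math.sqrt(self.var[d] + self.eps)
            + self.beta[d]
            for d, x in enumerate(frame)
        ]


# Training-time statistics: zero-mean, unit-variance features.
bn = OnlineBatchNorm(mean=[0.0, 0.0], var=[1.0, 1.0], gamma=[1.0, 1.0], beta=[0.0, 0.0])

# Toy field data with a shifted mean, standing in for a domain mismatch.
chunk = [[4.0, -2.0], [6.0, -2.0], [5.0, -2.0], [5.0, -2.0]]

before = bn.normalize(chunk[0])
for _ in range(50):  # repeated chunks let the running statistics converge
    bn.adapt(chunk)
after = bn.normalize(chunk[0])
```

No labels and no gradient computation are involved, so each update costs a single pass over the chunk, in line with the strict computational constraints the description mentions. After adaptation the stored mean moves toward the field-data mean, and the normalized activations land closer to the range seen during training.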