Improved Neural Network Based Acoustic Modeling Leveraging Multi-task Learning and Ensemble Learning for Meeting Speech Recognition

Autor: Yang, Ming-Han, 楊明翰
Rok vydání: 2016
Druh dokumentu: 學位論文 ; thesis
Popis: 104
This thesis sets out to explore the use of multi-task learning (MTL) and ensemble learning techniques for more accurate estimation of the parameters involved in neural network based acoustic models, so as to improve the accuracy of meeting speech recognition. Our main contributions are three-fold. First, we conduct an empirical study to leverage various auxiliary tasks to enhance the performance of multi-task learning on meeting speech recognition. Furthermore, we also study the synergy effect of combing multi-task learning with disparate acoustic models, such as deep neural network (DNN) and convolutional neural network (CNN) based acoustic models, with the expectation to increase the generalization ability of acoustic modeling. Second, since the way to modulate the contribution (weights) of different auxiliary tasks during acoustic model training is far from optimal and actually a matter of heuristic judgment, we thus propose a simple model adaptation method to alleviate such a problem. Third, an ensemble learning method is investigated to systematically integrate the various acoustic models (weak learners) trained with multi-task learning. A series of experiments have been carried out on the augmented multi-party interaction (AMI) and Mandarin meeting recording (MMRC) corpora, which seem to reveal the effectiveness of our proposed methods in relation to several existing baselines.
Databáze: Networked Digital Library of Theses & Dissertations