Feature transformations for robust speech recognition in reverberant conditions

Author: Raden S. Yuwana, Rika Sustika, Asri Rizki Yuliani, Hilman F. Pardede
Year of publication: 2017
Subject:
Source: 2017 International Conference on Computer, Control, Informatics and its Applications (IC3INA).
DOI: 10.1109/ic3ina.2017.8251740
Description: Noise robustness remains one of the main issues in automatic speech recognition (ASR) research. The introduction of deep neural network (DNN) technologies, while significantly improving ASR accuracy, has not yet achieved satisfactory performance. DNNs, which have proved able to discriminate features, are a promising technology for improving the robustness of speech recognition. Therefore, the use of feature transformations may benefit DNN-based ASR systems. In this paper, we compare several linear feature transformation techniques on several popular ASR features: MFCC, PLP, and FBANK. The experiments are evaluated on the Meeting Recorder Digits (MRD) subset of the Aurora-5 database, a set developed for evaluating robust speech recognition methods against reverberation. The features are evaluated on two types of ASR systems: HMM-GMM and HMM-DNN. The results indicate that LDA and MLLT transformations, while generally reducing the error in clean conditions for all features, are more beneficial for more primitive features such as FBANK in reverberant conditions. Their use on more decorrelated features such as MFCC and PLP is not necessarily effective in reverberant conditions. In such conditions, appending information about how the features change over time, i.e. using Delta features, is better.
Database: OpenAIRE
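
Note: The description refers to Delta features, i.e. appending information about how the features change over time. As a minimal sketch only, the block below uses the common regression formula for delta coefficients, d_t = sum_n n (c_{t+n} - c_{t-n}) / (2 sum_n n^2). The record does not state the window size or edge handling used by the authors; the `window=2` default, the edge replication, and the function names `delta` and `append_deltas` are assumptions made for illustration.

```python
def delta(frames, window=2):
    """Regression-based delta coefficients for a list of feature frames.

    frames: list of equal-length lists (one list of coefficients per frame).
    window: number of frames on each side used in the regression (assumed 2).
    """
    num_frames = len(frames)
    dims = len(frames[0]) if frames else 0
    denom = 2.0 * sum(n * n for n in range(1, window + 1))

    def frame_at(t):
        # Replicate the first/last frame beyond the edges (an assumption).
        return frames[min(max(t, 0), num_frames - 1)]

    out = []
    for t in range(num_frames):
        row = []
        for d in range(dims):
            acc = sum(
                n * (frame_at(t + n)[d] - frame_at(t - n)[d])
                for n in range(1, window + 1)
            )
            row.append(acc / denom)
        out.append(row)
    return out


def append_deltas(frames, window=2):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(frames, window)
    d2 = delta(d1, window)
    return [list(f) + a + b for f, a, b in zip(frames, d1, d2)]
```

For example, a 13-dimensional MFCC frame passed through `append_deltas` would become a 39-dimensional vector (static, delta, and delta-delta parts).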