Feature transformations for robust speech recognition in reverberant conditions
Author: | Raden S. Yuwana, Rika Sustika, Asri Rizki Yuliani, Hilman F. Pardede |
---|---|
Year of publication: | 2017 |
Subject: |
Reverberation; Artificial neural network; Computer science; Speech recognition; Feature extraction; Feature transformation; 02 engineering and technology; Uncorrelated; 030507 speech-language pathology & audiology; 03 medical and health sciences; Robustness (computer science); 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; 0305 other medical science; Hidden Markov model |
Source: | 2017 International Conference on Computer, Control, Informatics and its Applications (IC3INA). |
DOI: | 10.1109/ic3ina.2017.8251740 |
Description: | Noise robustness remains one of the main issues in automatic speech recognition (ASR) research. The introduction of deep neural network (DNN) technologies, while significantly improving ASR accuracy, has still not achieved satisfactory performance. DNNs, which have proved able to discriminate between features, are a promising technology for improving the robustness of speech recognition. The use of feature transformations may therefore benefit DNN-based ASR systems. In this paper, we compare several linear feature transformation techniques applied to several popular ASR features: MFCC, PLP, and FBANK. The experiments are conducted on the Meeting Recorder Digits (MRD) subset of the Aurora-5 database, a set developed for evaluating robust speech recognition methods under reverberation. The features are evaluated on two types of ASR systems: HMM-GMM and HMM-DNN. The results indicate that LDA and MLLT transformations, while generally reducing the error in clean conditions for all features, are more beneficial for more primitive features such as FBANK in reverberant conditions. Applying them to more uncorrelated features such as MFCC and PLP is not necessarily effective in reverberant conditions. In such conditions, appending information about how the features change over time, i.e. using Delta features, works better. |
Database: | OpenAIRE |
External link: |
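
The description mentions appending Delta features, i.e. information about how the features change over time. As a rough illustration only, the sketch below computes deltas with the common regression formula over a window of ±N frames and stacks them onto the static features. The record does not state which delta formula, window size, or toolkit the authors used, so the function names `delta` and `append_deltas`, the default `window=2`, and the edge-replication padding are all assumptions.

```python
import numpy as np


def delta(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Regression-based delta of a (num_frames, num_coeffs) feature matrix.

    Assumed formula (not taken from the paper):
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames beyond the edges are replaced by the first/last frame.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    num_frames = features.shape[0]
    # Replicate the first and last frames so every frame has full context.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + num_frames]
        behind = padded[window - n : window - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def append_deltas(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Stack static, delta, and delta-delta features along the coefficient axis."""
    d = delta(features, window)
    dd = delta(d, window)
    return np.hstack([features, d, dd])
```

For example, a matrix of 13 static coefficients per frame would become 39 coefficients per frame after `append_deltas`; whether that matches the dimensionality used in the paper is not stated in this record.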