Noise Robust Speech Recognition Using Deep Belief Networks

Authors:	Mahboubeh Farahat, Ramin Halavati
Year of publication:	2016
Subject:	Voice activity detection; Computer science; Speech recognition; Acoustic model; Networking & telecommunications; Pattern recognition; Engineering and technology; Speech processing; Speaker recognition; Computer Science Applications; Theoretical Computer Science; Deep belief network; Computer Science::Sound; Electrical engineering, electronic engineering, information engineering; Feature (machine learning); Artificial intelligence & image processing; Mel-frequency cepstrum; Artificial intelligence; Hidden Markov model; Software
Source:	International Journal of Computational Intelligence and Applications. 15:1650005
ISSN:	1757-5885 1469-0268
Description:	Most current speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame, or a short window of frames, of coefficients that represent the acoustic input. In these systems, acoustic inputs are represented as frames of Mel Frequency Cepstral Coefficients (MFCCs). However, MFCCs are not robust to noise. Consequently, when training and test conditions differ, the accuracy of speech recognition systems decreases. In addition, using MFCCs from a larger window of frames in GMMs requires more computational power. In this paper, Deep Belief Networks (DBNs) are used to extract discriminative information from a larger window of frames. Nonlinear transformations lead to high-order, low-dimensional features that are robust to variation in the input speech. Multi-speaker isolated word recognition tasks with 100 and 200 words, in clean and noisy environments, have been used to test this method. The experimental results indicate that this new method of feature encoding results in much better word recognition accuracy.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::1b945e9b0f4c7fd1e84bc2db5a9b56c4 https://doi.org/10.1142/s146902681650005x
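
The description refers to feeding a "larger window of frames" of MFCCs into the network. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way to build such an input: each frame is concatenated with its neighbours into a single vector. The function name `stack_context`, the window of 5 frames on each side, the 13 coefficients per frame, and the random placeholder data are all assumptions made for this example; the record does not state the values used in the paper.

```python
import numpy as np


def stack_context(frames: np.ndarray, left: int = 5, right: int = 5) -> np.ndarray:
    """Concatenate each frame with its neighbours into one long vector.

    frames has shape (num_frames, num_coeffs); the result has shape
    (num_frames, (left + 1 + right) * num_coeffs).
    """
    num_frames, _ = frames.shape
    # Repeat the first and last frames so every frame has a full window.
    padded = np.pad(frames, ((left, right), (0, 0)), mode="edge")
    window = left + 1 + right
    # Row t of the result holds frames t-left .. t+right, flattened.
    return np.stack(
        [padded[t:t + window].reshape(-1) for t in range(num_frames)]
    )


# Placeholder data standing in for real MFCCs (hypothetical sizes).
mfcc = np.random.default_rng(0).standard_normal((100, 13))
stacked = stack_context(mfcc)
print(stacked.shape)  # (100, 143)
```

Vectors of this kind would then serve as the visible-layer input to a DBN, whose hidden layers produce the lower-dimensional features the description mentions.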