Multiple resolution analysis for robust automatic speech recognition

Author:	Renato De Mori, Franco Mana, Dario Albesano, Roberto Gemello
Year of publication:	2006
Subject:	Binary tree; Computer science; Speech recognition; Noise reduction; Dimensionality reduction; Feature extraction; Normalization (image processing); Word error rate; Pattern recognition; Theoretical Computer Science; Human-Computer Interaction; Filter design; Principal component analysis; Artificial intelligence; Software
Source:	Computer Speech & Language. 20:2-21
ISSN:	0885-2308
DOI:	10.1016/j.csl.2004.06.001
Description:	This paper investigates the potential of exploiting the redundancy implicit in multiple resolution analysis for automatic speech recognition systems. The analysis is performed by a binary tree of elements, each consisting of a half-band filter followed by a downsampler that discards odd samples. Filter design and the computation of features from samples are discussed, and recognition performance with different choices is presented. A paradigm consisting of redundant feature extraction, followed by feature normalization, followed by dimensionality reduction is proposed. Feature normalization is performed by denoising algorithms. Two of them are considered and evaluated, namely signal-to-noise ratio-dependent spectral subtraction and soft thresholding. Dimensionality reduction is performed with principal component analysis. Experiments using telephone corpora and the Aurora3 corpus are reported. They indicate that, with clean speech, the proposed paradigm yields a recognition performance, measured in word error rate, marginally superior to that obtained with perceptual linear prediction coefficients. With noisy data, however, the performance of the proposed analysis paradigm is significantly superior when the same denoising algorithm is applied to all the analysis methods being compared.
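The pipeline described above (binary-tree analysis, soft thresholding, principal component analysis) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the Haar filter pair is assumed as the simplest stand-in for the half-band filters, log energy per tree node is an assumed feature, and all function names (`split`, `tree_analysis`, `soft_threshold`, `log_energies`, `pca_reduce`) are hypothetical.

```python
import numpy as np

# Assumption: Haar pair as a simple stand-in for the paper's half-band filters.
LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)
HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)


def split(x):
    """One tree element per branch: filter, then discard odd-indexed samples."""
    low = np.convolve(x, LOW)[::2]
    high = np.convolve(x, HIGH)[::2]
    return low, high


def tree_analysis(x, depth):
    """Full binary tree; keeping every level makes the representation redundant."""
    levels = [[np.asarray(x, dtype=float)]]
    for _ in range(depth):
        children = []
        for node in levels[-1]:
            children.extend(split(node))
        levels.append(children)
    return levels


def soft_threshold(c, t):
    """Shrink coefficients toward zero by t; values with |c| <= t become 0."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)


def log_energies(levels):
    """Assumed feature: log energy of each node below the root."""
    return np.array(
        [np.log(np.sum(node ** 2) + 1e-12) for lvl in levels[1:] for node in lvl]
    )


def pca_reduce(features, k):
    """Project row-wise feature vectors onto the top-k principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T
```

In this sketch a frame of samples would go through `tree_analysis`, each node could be denoised with `soft_threshold` (the threshold value is left to the caller), and the stacked `log_energies` vectors of many frames would be passed to `pca_reduce`. SNR-dependent spectral subtraction, the other denoising method named in the description, is not shown.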
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::de8b5da5804b160efb3bcdf1e6ab4001 https://doi.org/10.1016/j.csl.2004.06.001