Auditory Model-Based Design and Optimization of Feature Vectors for Automatic Speech Recognition

Authors:	Saikat Chatterjee, W B Kleijn
Year of publication:	2011
Subject:	Acoustics and Ultrasonics; Computer science; Speech recognition; Feature vector; Feature extraction; Pattern recognition; Dynamic feature; Human auditory system; Computer Science::Sound; Computational auditory scene analysis; Perception; Model-based design; Artificial intelligence; Mel-frequency cepstrum; Electrical and Electronic Engineering
Source:	IEEE Transactions on Audio, Speech, and Language Processing. 19:1813-1825
ISSN:	1558-7924 1558-7916
DOI:	10.1109/tasl.2010.2101597
Description:	Using spectral and spectro-temporal auditory models together with perturbation-based analysis, we develop a new framework for optimizing a feature vector so that it emulates the behavior of the human auditory system. The optimization is carried out offline, based on the conjecture that the local geometries of the feature vector domain and the perceptual auditory domain should be similar. Applying this principle with a static spectral auditory model, we modify and optimize the static spectral mel-frequency cepstral coefficients (MFCCs) without using any feedback from the speech recognition system. We then extend the work to incorporate spectro-temporal auditory properties into the design of a new dynamic spectro-temporal feature vector. Using a spectro-temporal auditory model, we design and optimize the dynamic feature vector to capture the human auditory response across time and frequency. We show that a significant improvement in automatic speech recognition (ASR) performance is obtained in both clean and noisy environmental conditions.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::185efc3db0ffc1cfda66161eaa38191c https://doi.org/10.1109/tasl.2010.2101597
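
Illustration:	The local-geometry conjecture in the abstract can be made concrete with a small sketch. This is not the authors' algorithm. It is a minimal toy under stated assumptions: the "auditory model" is a hypothetical power-law compression, the "feature" is a hypothetical one-parameter family, and the helper names (local_distances, geometry_mismatch, power_feature) are invented for this example. The idea shown is only that small perturbations of an input should produce proportional distances in the feature domain and in the perceptual domain, and that a parameter can be chosen offline to minimize the mismatch.

```python
import numpy as np


def local_distances(mapping, x, perturbations):
    """Squared distances between mapping(x) and mapping(x + p) for each p."""
    base = mapping(x)
    return np.array([np.sum((mapping(x + p) - base) ** 2) for p in perturbations])


def geometry_mismatch(feature, auditory, points, eps=1e-3, n_pert=32, seed=0):
    """Average normalized mismatch between local distance patterns.

    A free scalar gain is fitted per point, so only the shape of the
    local geometry is compared, not its overall scale.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for x in points:
        perts = eps * rng.standard_normal((n_pert, x.size))
        d_feat = local_distances(feature, x, perts)
        d_aud = local_distances(auditory, x, perts)
        gain = np.dot(d_feat, d_aud) / np.dot(d_feat, d_feat)
        total += np.sum((gain * d_feat - d_aud) ** 2) / np.sum(d_aud ** 2)
    return total / len(points)


def power_feature(theta):
    """Hypothetical one-parameter feature family: elementwise x ** theta."""
    return lambda x: x ** theta


def auditory(x):
    """Toy stand-in for an auditory model (assumption, not a real model)."""
    return x ** 0.3


# Toy "spectra": strictly positive vectors, so the perturbed inputs stay positive.
points = np.random.default_rng(1).uniform(1.0, 10.0, size=(20, 8))

# Offline grid search: no recognizer feedback is involved.
thetas = [0.1, 0.2, 0.3, 0.5, 1.0]
scores = {t: geometry_mismatch(power_feature(t), auditory, points) for t in thetas}
best_theta = min(scores, key=scores.get)
print(best_theta, scores[best_theta])
```

In this toy setting the search recovers the exponent of the stand-in auditory model, because that choice makes the two local distance patterns identical. A realistic version would replace both toy mappings with an actual auditory model and a parameterized feature extractor such as a modified MFCC pipeline.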