A speech feature based on Bark frequency warping: the non-uniform linear prediction (NLP) cepstrum
Author: | J.O. Smith, Yoon Young Kim |
---|---|
Year of publication: | 2003 |
Subject: | Computer science, Speech recognition, Pattern recognition, Statistical model, Linear prediction, Speech processing, Linear discriminant analysis, Cepstrum, Feature (machine learning), Artificial intelligence, Image warping, Hidden Markov model, Natural language processing |
Source: | Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452). |
DOI: | 10.1109/aspaa.1999.810867 |
Description: | We propose a new method of obtaining features from speech signals for robust analysis and recognition: the non-uniform linear prediction (NLP) cepstrum. The objective is to derive a representation that suppresses speaker-dependent characteristics while preserving the linguistic quality of speech segments. The analysis is based on two principles. First, Bark frequency warping is performed on the LP spectrum to emulate the auditory spectrum. While widely used methods such as mel-frequency and PLP analysis use the FFT spectrum as their basis for warping, NLP analysis uses the LP-based vocal-tract spectrum with glottal effects removed. Second, all-pole modeling (LP) is used both before and after the warping: the pre-warp LP first obtains the vocal-tract spectrum, while the post-warp LP yields a smoothed, two-peak model of the warped spectrum. Experiments tested the effectiveness of the proposed feature in two settings: identification/discrimination of vowels uttered by multiple speakers using linear discriminant analysis (LDA), and frame-based vowel recognition with a statistical model. In both cases, NLP analysis was shown to be an effective tool for speaker-independent speech analysis and recognition applications. |
Database: | OpenAIRE |
External link: |
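
The two-stage analysis summarized in the description (pre-warp LP, Bark warping of the LP spectrum, post-warp LP, cepstrum) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several choices are assumptions that the record does not specify: the Bark approximation `6 * asinh(f / 600)` (the form commonly used in PLP), the Hamming window, the model orders (`pre_order=10`, `post_order=4`, the latter chosen because two pole pairs can represent two spectral peaks), the number of warped-spectrum samples, and all function names. Removal of glottal effects is not modeled here.

```python
import cmath
import math

def hz_to_bark(f):
    # Assumed Bark approximation (PLP-style); the paper may use a different one.
    return 6.0 * math.asinh(f / 600.0)

def bark_to_hz(z):
    return 600.0 * math.sinh(z / 6.0)

def autocorr(x, order):
    # Autocorrelation lags 0..order of a (windowed) frame.
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]

def levinson(r, order):
    # Levinson-Durbin recursion; returns (a, err) for A(z) = 1 + sum a[k] z^-k.
    a, err = [1.0] + [0.0] * order, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lp_power(a, gain, omega):
    # All-pole power spectrum gain / |A(e^{j omega})|^2.
    A = sum(ak * cmath.exp(-1j * omega * k) for k, ak in enumerate(a))
    return gain / abs(A) ** 2

def warped_autocorr(a, gain, fs, order, n_points=128):
    # Sample the LP spectrum at points spaced uniformly in Bark, then take an
    # inverse cosine transform to get autocorrelation lags on the warped axis.
    z_max = hz_to_bark(fs / 2.0)
    P = [lp_power(a, gain, 2.0 * math.pi * bark_to_hz(z_max * m / n_points) / fs)
         for m in range(n_points + 1)]
    R = []
    for k in range(order + 1):
        inner = sum(P[m] * math.cos(math.pi * k * m / n_points)
                    for m in range(1, n_points))
        R.append((P[0] + (-1) ** k * P[n_points] + 2.0 * inner) / (2.0 * n_points))
    return R

def lp_to_cepstrum(a, n_ceps):
    # Standard LP-to-cepstrum recursion for 1/A(z); returns c[1..n_ceps].
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = (-a[n] if n <= p else 0.0) - acc / n
    return c[1:]

def nlp_cepstrum(frame, fs, pre_order=10, post_order=4, n_ceps=8):
    # Hypothetical end-to-end helper: pre-warp LP -> Bark warp -> post-warp LP -> cepstrum.
    N = len(frame)
    win = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)))
           for n, s in enumerate(frame)]
    a1, g1 = levinson(autocorr(win, pre_order), pre_order)
    a2, _ = levinson(warped_autocorr(a1, g1, fs, post_order), post_order)
    return lp_to_cepstrum(a2, n_ceps)
```

Calling `nlp_cepstrum(frame, fs)` on one frame of samples returns `n_ceps` coefficients. The sketch resamples the spectrum on the Bark axis without a Jacobian correction, which is a simplification rather than something stated in the abstract.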