An Efficient Solution to Sparse Linear Prediction Analysis of Speech
Author: | Vahid Khanagha, Khalid Daoudi |
---|---|
Contributors: | Geometry and Statistics in acquisition data (GeoStat), Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria), BMC, Ed. |
Language: | English |
Publication year: | 2013 |
Subject: | Mathematical optimization; Acoustics and Ultrasonics; Computational complexity theory; Computer science; Linear prediction; Sparse approximation; Function (mathematics); [INFO] Computer Science [cs]; Residual; Weighting; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Convex optimization; Electrical and Electronic Engineering; Algorithm; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Linear least squares |
Source: | EURASIP Journal on Audio, Speech, and Music Processing, 2013, 3, ⟨10.1186/1687-4722-2013-3⟩ |
ISSN: | 1687-4714 1687-4722 |
DOI: | 10.1186/1687-4722-2013-3 |
Description: | EURASIP Journal on Audio, Speech, and Music Processing, Special Issue on Sparse Modeling for Speech and Audio Processing; International audience; We propose an efficient closed-form solution to the problem of sparse linear prediction analysis of the speech signal. Our method minimizes a weighted l2-norm of the prediction error. The weighting function is constructed so that less emphasis is given to the error around the points where the largest prediction errors are expected (the glottal closure instants); the resulting cost function thus approaches the ideal l0-norm cost function for sparse residual recovery. We show that minimizing this mathematically tractable objective function (by solving the normal equations of a linear least squares problem) yields sparser residuals than the l1-norm minimization approach, which relies on computationally demanding convex optimization methods. Indeed, the computational complexity of the proposed method is roughly the same as that of classical minimum variance linear prediction analysis. Moreover, to show a potential application of such a sparse representation, we use the resulting linear prediction coefficients inside a multi-pulse coder and show that it achieves better coding quality than the classical multi-pulse excitation coder, which uses the traditional minimum variance synthesizer. |
Database: | OpenAIRE |
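The abstract describes minimizing a weighted l2-norm of the prediction error by solving normal equations. The following is a minimal sketch of that idea, not the authors' implementation. The function names (`gci_weights`, `weighted_lp`, `synth_signal`, and so on), the Gaussian-shaped weight, and its `sigma` and `depth` parameters are assumptions made for illustration; the record does not specify the actual weighting function or how the glottal closure instants (GCIs) are detected.

```python
import math

def gci_weights(n_samples, gci_positions, sigma=2.0, depth=1.0):
    """Hypothetical weight: close to 1 far from GCIs, dipping to 1 - depth at a GCI."""
    weights = []
    for n in range(n_samples):
        d = min(abs(n - g) for g in gci_positions)  # distance to nearest GCI
        weights.append(1.0 - depth * math.exp(-0.5 * (d / sigma) ** 2))
    return weights

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def weighted_lp(signal, order, weights):
    """Minimize sum_n w[n] * (x[n] - sum_k a[k] * x[n-k])**2 via normal equations."""
    R = [[0.0] * order for _ in range(order)]
    r = [0.0] * order
    for n in range(order, len(signal)):
        past = [signal[n - k] for k in range(1, order + 1)]
        for i in range(order):
            r[i] += weights[n] * past[i] * signal[n]
            for j in range(order):
                R[i][j] += weights[n] * past[i] * past[j]
    return solve_linear_system(R, r)

def residual(signal, coeffs):
    """Prediction error for n >= order, using the given predictor coefficients."""
    order = len(coeffs)
    return [signal[n] - sum(coeffs[k] * signal[n - 1 - k] for k in range(order))
            for n in range(order, len(signal))]

def synth_signal(n_samples, coeffs, gci_positions):
    """Toy 'voiced' signal: an all-pole filter excited by unit impulses at the GCIs."""
    x = [0.0] * n_samples
    for n in range(n_samples):
        past = sum(c * x[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        x[n] = past + (1.0 if n in gci_positions else 0.0)
    return x

# Usage: compare the weighted solution with plain (unweighted) least squares.
gcis = [20, 70, 120, 170]
x = synth_signal(200, [1.3, -0.8], gcis)
a_weighted = weighted_lp(x, 2, gci_weights(len(x), gcis))
a_plain = weighted_lp(x, 2, [1.0] * len(x))
```

On this toy signal the excitation is exactly an impulse train, so down-weighting the samples at the GCIs lets the least-squares fit ignore the only samples that cannot be predicted. Real speech is not this clean, and the GCI positions would have to be estimated first.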