Implementation and Complexity Reduction for Scalable Speech Coders
Author: | Mu-Liang Wang, 王木良 |
---|---|
Year of publication: | 2005 |
Document type: | 學位論文 ; thesis |
Description: | 93 In this dissertation, we propose several algorithms to improve the performance of scalable speech coders and to reduce their computational complexity. First, a fast search algorithm for CELP speech coders is proposed to reduce the computational complexity. Second, two spectral estimation schemes and two quantization schemes are proposed to estimate and to quantize, respectively, the spectral envelope of parametric speech coders. Finally, a classified LPC quantization scheme is proposed to quantize the LSF vector and to achieve transparent quantization of classified LPC parameters. Searching the stochastic codebook of a CELP speech coder, which relies on the analysis-by-synthesis (AbS) search mechanism, requires a huge computational effort. To reduce this complexity, we propose a generalized candidate (GC) scheme. Theoretical analyses and experimental results demonstrate that the proposed GC scheme, combined with the multi-pulse maximum likelihood quantization (MP-MLQ) scheme of the MPEG-4 CELP coder, reduces the computational load by over 50%. Combined with the depth-first tree search (DFTS) scheme in the 3GPP narrowband adaptive multi-rate (AMR-NB) speech coder, it reduces the number of search loops in the ACELP codebook search by a factor of about 4. In both cases, the degradation of the reconstructed speech quality is perceptually negligible. Harmonic modeling has been widely adopted in low-rate parametric speech coders, and efficient encoding of the spectral envelope parameters is an essential issue in harmonic speech coders. We propose two spectral estimation algorithms to estimate the spectral amplitudes and to refine the fractional pitch lag of the speech signal. To estimate the parameters of speech signals with time-varying characteristics, a precise spectral estimation approach based on a time-varying sinusoidal model (TSM) is proposed; it reduces the spectral distortion by approximately 1.69 dB. A second, fast spectral estimation scheme is designed for low complexity and reduces the computation involved in spectral estimation by more than 70%. Informal listening tests confirm that there is virtually no detectable quality difference between the original estimation scheme and the proposed fast scheme. To quantize the spectral envelope vector effectively, a quantization scheme based on human hearing properties is proposed. The proposed hearing-based spectral envelope vector quantization (HSEVQ) scheme quantizes the spectral envelope vector according to the minimum Bark spectral distortion (MBSD) criterion. A simplified HSEVQ (SSEVQ) scheme is developed to reduce the computational complexity. Theoretical analyses and simulation results reveal that the SSEVQ method reduces the computation of the traditional spectral envelope (SE) vector quantization scheme by a factor of nine while retaining the quality of the reconstructed speech signal. Finally, a classified LPC quantization (CLPQ) scheme is proposed to quantize the classified LSF vector at a minimum bit rate. Under an objective spectral distortion measure, the CLPQ scheme achieves transparent quantization of unvoiced speech spectral information with 10 bits, and of voiced speech spectral information with 21 bits, per 20 ms frame. The proposed CLPQ scheme can encode the LPC coefficients with a variable bit rate and computational scalability. |
Database: | Networked Digital Library of Theses & Dissertations |
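
The description reports transparent LPC quantization at 10 and 21 bits per 20 ms frame under an objective spectral distortion measure, but this record does not give the exact measure the thesis uses. As a rough illustration only, the sketch below assumes the commonly used log spectral distortion between two all-pole (LPC) envelopes, for which an average of about 1 dB is conventionally treated as transparent, and converts bits per frame into a bit rate. The function names and the example coefficients are hypothetical and do not come from the thesis.

```python
import cmath
import math


def lpc_power_spectrum_db(lpc, n_points=256):
    """Log power spectrum (dB) of the all-pole filter 1/A(z).

    lpc holds [1, a1, ..., ap] for A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    """
    spectrum = []
    for k in range(n_points):
        omega = math.pi * k / n_points  # frequencies from 0 up to (not including) pi
        a_of_z = sum(c * cmath.exp(-1j * omega * i) for i, c in enumerate(lpc))
        # 10*log10(1/|A|^2) equals -20*log10(|A|)
        spectrum.append(-20.0 * math.log10(abs(a_of_z)))
    return spectrum


def spectral_distortion_db(lpc_ref, lpc_quant, n_points=256):
    """RMS difference (dB) between two LPC log spectra (assumed SD definition)."""
    ref = lpc_power_spectrum_db(lpc_ref, n_points)
    quant = lpc_power_spectrum_db(lpc_quant, n_points)
    mean_sq = sum((r - q) ** 2 for r, q in zip(ref, quant)) / n_points
    return math.sqrt(mean_sq)


def lpc_bit_rate(bits_per_frame, frame_ms=20.0):
    """Bit rate in bit/s spent on LPC parameters for a given frame length."""
    return bits_per_frame * 1000.0 / frame_ms


if __name__ == "__main__":
    # Hypothetical first-order example: original vs. slightly perturbed coefficient.
    print(spectral_distortion_db([1.0, -0.9], [1.0, -0.85]))
    print(lpc_bit_rate(10), lpc_bit_rate(21))
```

With the figures quoted in the description, 10 bits per 20 ms frame corresponds to 500 bit/s for unvoiced frames and 21 bits to 1050 bit/s for voiced frames.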