Author: |
Hiroi, Jun, Tokuda, Keiichi, Masuko, Takashi, Kobayashi, Takao, Kitamura, Tadashi |
Subject: |
|
Source: |
Systems & Computers in Japan; 11/15/2001, Vol. 32 Issue 12, p38-46, 9p |
Abstract: |
In this paper the authors describe a very low bit rate speech coder based on HMMs (hidden Markov models). In the encoder, phoneme recognition is performed using HMMs, and the phoneme index sequence, state durations, and pitch information are sent to the decoder. In the decoder, the phoneme HMMs are concatenated according to the phoneme index sequence. A mel-cepstral coefficient sequence is then generated from the concatenated HMMs, given the state durations, by a speech parameter generation algorithm based on the maximum likelihood criterion. Finally, speech is synthesized by exciting an MLSA (Mel Log Spectrum Approximation) filter, whose coefficients are the generated mel-cepstral coefficients, according to the pitch information. Subjective evaluation experiments show that the proposed coder at 146 bit/s (including 26% silent intervals), excluding pitch information, achieves performance similar to that of a vector-quantization vocoder at 400 bit/s (8 bit/frame × 50 frame/s), also excluding pitch information. © 2001 Scripta Technica, Syst Comp Jpn, 32(12): 38–46, 2001 [ABSTRACT FROM AUTHOR] |
Database: |
Supplemental Index |
External link: |
|
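
The bit-rate comparison quoted in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the record or the paper; the constant and function names are hypothetical, and only the figures (8 bit/frame, 50 frame/s, 146 bit/s) come from the abstract.

```python
# Figures quoted in the abstract; all names here are illustrative assumptions.
VQ_BITS_PER_FRAME = 8            # vector-quantization vocoder, bits per frame
FRAMES_PER_SECOND = 50           # frame rate of the baseline vocoder
PROPOSED_BITS_PER_SECOND = 146   # HMM-based coder, excluding pitch information


def vq_bit_rate(bits_per_frame: int, frames_per_second: int) -> int:
    """Return the baseline bit rate in bit/s (bits per frame times frame rate)."""
    return bits_per_frame * frames_per_second


baseline = vq_bit_rate(VQ_BITS_PER_FRAME, FRAMES_PER_SECOND)
ratio = PROPOSED_BITS_PER_SECOND / baseline

print(f"VQ vocoder baseline: {baseline} bit/s")
print(f"Proposed coder: {PROPOSED_BITS_PER_SECOND} bit/s ({ratio:.1%} of baseline)")
```

Under these figures the proposed coder uses roughly a third of the baseline bit rate, both excluding pitch information.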