Generation of broadband speech from narrowband speech based on linear mapping
Author: | Mineo Tsushima, Yoshihisa Nakatoh, Takeshi Norimatsu |
---|---|
Year of publication: | 2002 |
Subject: | |
Source: | Electronics and Communications in Japan (Part II: Electronics). 85:44-53 |
ISSN: | 1520-6432 8756-663X |
DOI: | 10.1002/ecjb.10065 |
Description: | In this paper, a method for generating broadband speech from band-limited speech using spectral linear mapping is proposed. This method is based on LPC analysis and synthesis. It first extracts sound source information (residual waveforms) and vocal tract information (spectral envelopes) from input speech, performs linear mapping of the vocal tract information and nonlinear processing of the sound source information for broadband conversion, and finally generates broadband speech from both by LPC synthesis. To make the spectral envelope broadband by linear mapping, the spectral space is divided into a number of subspaces, and a narrowband spectrum is converted into a broadband spectrum by the conversion matrix of its subspace. Each conversion matrix is estimated from training speech so that the mean square error between the converted spectrum and the target broadband spectrum is minimized. Spectral distortions of the proposed method were compared experimentally with those of codebook mapping and neural network methods, and the proposed linear-mapping method was verified to perform no worse than the other two. In addition, subjective evaluation tests verified that the proposed method gives listeners a sense of wider bandwidth. © 2002 Wiley Periodicals, Inc. Electron Comm Jpn Pt 2, 85(8): 44–53, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjb.10065 |
Database: | OpenAIRE |
External link: |
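
The envelope-mapping step summarized in the abstract (divide the spectral space into subspaces, then apply a per-subspace conversion matrix fitted by minimizing mean square error) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not say how the subspaces are formed or how the envelopes are parameterized, so the sketch assumes a nearest-centroid partition and generic feature vectors. The function names `assign_subspace`, `fit_conversion_matrices`, and `convert` are hypothetical.

```python
import numpy as np


def assign_subspace(x, centroids):
    """Index of the nearest centroid for each row of x (assumed partition rule)."""
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def fit_conversion_matrices(narrow, wide, centroids):
    """Fit one matrix per subspace by least squares.

    narrow: (N, d_n) narrowband envelope features from training speech
    wide:   (N, d_w) matching broadband envelope features
    Returns a list of (d_n, d_w) matrices A_k that minimize the squared
    error of narrow[k-th subspace] @ A_k against wide[k-th subspace].
    """
    labels = assign_subspace(narrow, centroids)
    matrices = []
    for k in range(len(centroids)):
        mask = labels == k
        if not mask.any():
            # No training frames fell into this subspace: fall back to zeros.
            matrices.append(np.zeros((narrow.shape[1], wide.shape[1])))
            continue
        a_k, *_ = np.linalg.lstsq(narrow[mask], wide[mask], rcond=None)
        matrices.append(a_k)
    return matrices


def convert(narrow, centroids, matrices):
    """Map each narrowband vector with the matrix of its own subspace."""
    labels = assign_subspace(narrow, centroids)
    out = np.empty((narrow.shape[0], matrices[0].shape[1]))
    for k, a_k in enumerate(matrices):
        mask = labels == k
        out[mask] = narrow[mask] @ a_k
    return out
```

In a full system along the lines of the abstract, the converted envelope would then be combined with a separately broadened residual signal in LPC synthesis; that stage is outside this sketch.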