Structural maximum a posteriori speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS
Author: | Chen-Yu Chiang, I-Bin Liao, Sin-Horng Chen |
---|---|
Year of publication: | 2016 |
Subject: |
Computer science
Speech recognition; Maximum likelihood; Rate dependent; Mandarin Chinese; Speaker diarisation; 030507 speech-language pathology & audiology; 03 medical and health sciences; Maximum a posteriori estimation; Range (statistics); 0305 other medical science; Speaker adaptation |
Source: | ICASSP |
DOI: | 10.1109/icassp.2016.7472754 |
Description: | This paper presents a structural maximum a posteriori (MAP) speaker adaptation method that adjusts an existing speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) to a new speaker's data in order to realize a new voice at any given SR. The adaptive SR-HPM is formulated using MAP estimation, with a reference SR-HPM serving as an informative prior. The prior information provided by the reference SR-HPM is organized hierarchically by decision trees. Objective and subjective evaluations showed that the proposed method performed slightly better than the maximum likelihood-based model within the SR range observed in the target speaker's data and much better in the unseen SR range. |
Database: | OpenAIRE |
External link: |