Automatic generation of synthesis units and prosodic information for Chinese concatenative synthesis
Author: | Chung-Hsien Wu, Jau Hung Chen |
---|---|
Publication year: | 2001 |
Subject: |
Linguistics and Language
Phrase; Computer science; Communication; Speech recognition; Intonation (linguistics); Speech synthesis; Language and Linguistics; Computer Science Applications; ComputingMethodologies_PATTERNRECOGNITION; Modeling and Simulation; Computer Vision and Pattern Recognition; Artificial intelligence; Concatenative synthesis; Syllable; Prosody; Software; Word (computer architecture); Natural language processing; Pitch contour |
Source: | Speech Communication. 35:219-237 |
ISSN: | 0167-6393 |
DOI: | 10.1016/s0167-6393(00)00075-3 |
Description: | This paper proposes approaches to generating synthesis units and prosodic information for Mandarin Chinese text-to-speech (TTS) conversion. Monosyllables are adopted as the basic synthesis units. A set of synthesis units is selected from a large continuous speech database using two cost functions that minimize inter- and intra-syllable distortion. The speech database is also used to build a word-prosody-based template tree according to four linguistic features: tone combination, word length, part-of-speech (POS) of the word, and word position in a phrase. The template tree stores the prosodic features of a word (pitch contour, average energy, and syllable duration) for the possible combinations of these linguistic features. Two modules, one for sentence intonation and one for template selection, are proposed to generate the target prosodic templates. The experimental results showed that the synthesized prosodic features matched quite well with their original counterparts. Evaluation by subjective experiments also confirmed the satisfactory performance of these approaches. |
Database: | OpenAIRE |