RESTRICTED DOMAIN MALAY SPEECH SYNTHESIZER USING SYNTAX-PROSODY REPRESENTATION
Author: | Rosni Abdullah, Tang Enya Kong, Sabrina Tiun |
---|---|
Year of publication: | 2012 |
Subject: | Computer Networks and Communications; Computer science; Speech recognition; Speech corpus; Speech synthesis; Syntax; Tree (data structure); Artificial intelligence; Prosody; Software; Sentence; Natural language processing; Utterance |
Source: | Journal of Computer Science. 8:1961-1969 |
ISSN: | 1549-3636 |
DOI: | 10.3844/jcssp.2012.1961.1969 |
Description: | A restricted-domain speech application needs a synthesizer whose output quality approaches that of the ‘slot-filler’ approach while retaining at least the minimal flexibility of a ‘genuine’ speech synthesizer. In this study, we propose an alternative approach to building a speech synthesizer for restricted-domain speech applications. Our approach uses the word as the primary synthesis unit and represents the speech corpus as syntax-prosody tree structures. Speech is synthesized by constructing a syntax-prosody tree for the target input sentence. The tree is constructed by adapting an example-based syntactic parsing approach, and the synthesized utterance is the concatenation of the synthesis units taken from the nodes of the constructed tree. For evaluation, we carried out a subjective MOS (mean opinion score) evaluation comparing our speech synthesizer with natural speech and two other Malay TTS systems. Based on ANOVA and t-test analyses, the overall MOS scores were: natural speech, A (mean = 4.71, sd = 0.21); our speech synthesizer, B (mean = 3.34, sd = 1.10); and the two other Malay TTS systems, C (mean = 1.95, sd = 0.72) and D (mean = 1.80, sd = 1.04). We conclude that our Malay speech synthesizer sounded more natural, easier to listen to, more pleasant and more fluent than the other two Malay TTS systems. As expected, the recorded speech was perceived as more natural than the output of our Malay speech synthesizer. |
Database: | OpenAIRE |
External link: |
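
The abstract says the synthesized utterance is the concatenation of the synthesis units from the nodes of the constructed syntax-prosody tree. A minimal Python sketch of that final step follows. It is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the labels `S`, `NP` and `VP`, the unit IDs such as `saya_001`, and the example sentence are all hypothetical, and the example-based parsing that would build the tree is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a hypothetical syntax-prosody tree (assumed structure)."""
    label: str                   # assumed syntactic/prosodic label, e.g. "NP"
    unit: Optional[str] = None   # assumed ID of a recorded word unit (leaves only)
    children: List["Node"] = field(default_factory=list)


def collect_units(node: Node) -> List[str]:
    """Return the word-unit IDs of the leaves in left-to-right order."""
    if not node.children:
        # A leaf contributes its unit, if it has one.
        return [node.unit] if node.unit is not None else []
    units: List[str] = []
    for child in node.children:
        units.extend(collect_units(child))
    return units


# Hypothetical tree for the Malay sentence "saya makan nasi" ("I eat rice").
tree = Node("S", children=[
    Node("NP", children=[Node("N", unit="saya_001")]),
    Node("VP", children=[
        Node("V", unit="makan_014"),
        Node("NP", children=[Node("N", unit="nasi_007")]),
    ]),
])

units = collect_units(tree)
print(units)
```

In a real system each unit ID would presumably point to a recorded word waveform in the speech corpus, and the waveforms would be joined in this order to produce the utterance.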