Utilizing Indonesian Allophones and Intraword Short Pauses Handling to Improve Performance of Indonesian Text-To-Speech
Authors: | Lyla Ruslana Aini, Asril Jarin, Fara Ayuningtyas, Made Gunawan, Elvira Nurfadhilah, Hammam Riza, Gunarso, Harnum Annisa, Agung Santosa, Mohammad Teduh Uliniansyah |
---|---|
Year of publication: | 2018 |
Subject: | Text corpus; Computer science; Speech recognition; Speech synthesis; Intelligibility (communication); Allophone; Indonesian; Speech-language pathology & audiology; Medical and health sciences; Language development; Naturalness; Language center; Other medical science |
Source: | IALP |
Description: | An allophone is a variant of a phoneme that depends on its position within a word; for instance, the first phoneme e in “pendekar” is pronounced differently from the second phoneme e. According to Badan Pengembangan dan Pembinaan Bahasa (Language Development and Fostering Agency), formerly Pusat Bahasa (Language Center), Bahasa Indonesia has 5 vowels and 22 consonants, 6 of which have allophones. Only the allophones of one phoneme, e, can change the meaning of a word; the allophones of the other 5 phonemes do not change word meanings. Therefore, most research and development projects on Indonesian text-to-speech (TTS) systems focus only on the allophones of the phoneme e. This paper proposes a method that uses all allophones of Bahasa Indonesia in developing a model for an Indonesian TTS system with a deep neural network (DNN) method. Furthermore, intraword short pauses are implemented to improve intelligibility and naturalness. A set of rules is introduced to automatically detect allophones and intraword short pauses in the text corpus used for recording the audio data. In both subjective and objective evaluations, the resulting TTS model performs better than one that does not use allophones and intraword short pauses. |
Database: | OpenAIRE |
External link: |