A Tool to Solve Sentence Segmentation Problem on Preparing Speech Database for Indonesian Text-to-speech System
Author: | Fara Ayuningtyas, Lyla Ruslana Aini, Mohammad Teduh Uliniansyah, Elvira Nurfadhilah, Juliati Junde, Gunarso, Agung Santosa |
---|---|
Language: | English |
Subject: | Audio mining; Process (engineering); Computer science; Speech recognition; Speech synthesis; 02 engineering and technology; computer.software_genre; TTS; Task (project management); segmenting audio data; 0202 electrical engineering, electronic engineering, information engineering; Segmentation; Sentence segmentation; Syllable-timed; General Environmental Science; Bahasa Indonesia; training data; business.industry; 020206 networking & telecommunications; language.human_language; Indonesian language; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing |
Source: | SLTU |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2016.04.048 |
Description: | Creating training data for a text-to-speech (TTS) system can be difficult, because the recorded audio sometimes does not match the prepared texts. To overcome differences between the audio and text data, we developed a tool that segments audio data into sentences. Manual sentence segmentation of audio data is known to require effort and resources. This paper presents a solution that alleviates problems encountered while segmenting audio data for an Indonesian TTS system. The tool is based on the fact that Bahasa Indonesia is a syllable-timed language. We found that our tool reduces the resources needed to segment Indonesian audio data. |
Database: | OpenAIRE |
External link: |