A Tool to Solve Sentence Segmentation Problem on Preparing Speech Database for Indonesian Text-to-speech System

Author:	Fara Ayuningtyas, Lyla Ruslana Aini, Mohammad Teduh Uliniansyah, Elvira Nurfadhilah, Juliati Junde, Gunarso, Agung Santosa
Language:	English
Subject:	Audio mining; Process (engineering); Computer science; Speech recognition; Speech synthesis; 02 engineering and technology; TTS; Task (project management); segmenting audio data; 0202 electrical engineering, electronic engineering, information engineering; Segmentation; Sentence segmentation; Syllable-timed; General Environmental Science; Bahasa Indonesia; training data; 020206 networking & telecommunications; Indonesian language; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; Artificial intelligence; Natural language processing
Source:	SLTU
ISSN:	1877-0509
DOI:	10.1016/j.procs.2016.04.048
Description:	Creating training data for a text-to-speech (TTS) system can be difficult because the recorded audio does not always match the prepared text. To reconcile differences between the audio and text data, we developed a tool that segments audio data into sentences. Segmenting audio into sentences manually requires considerable effort and resources. This paper presents a solution that alleviates problems encountered when segmenting audio data for an Indonesian TTS system. The tool is based on the fact that Bahasa Indonesia is a syllable-timed language. We found that the tool reduces the resources needed to segment Indonesian audio data.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::81f06cc4dd507cbc822dee1f470e6b9d