Author: |
Nursuriati Jamil, Muhammad Izzad Ramli, Noraini Seman |
Language: |
English |
Year of publication: |
2015 |
Subject: |
|
Source: |
Journal of Electrical Systems, Vol 11, Iss 3, Pp 308-318 (2015) |
ISSN: |
1112-5209 |
Description: |
Sentence boundary detection (SBD), also known as sentence segmentation, decides where a sentence begins and ends. Previous SBD methods use a linguistic approach, an acoustic approach, or a combination of both. Although the linguistic approach generally performs better than the acoustic approach, it requires a speech recognition component. This is a constraint for under-resourced languages such as Malay. This paper describes SBD for spontaneous spoken Malay audio. Experiments are conducted on a forty-two-minute question-answer (Q/A) Malaysian parliamentary session comprising 12 adult male speakers and 4 female speakers. The speech data are first classified into speech and non-speech segments, and only the non-speech segments are further tested as sentence boundary candidates. Seven prosodic features, rate-of-speech and volume are then extracted from the boundary candidates for classification. Our proposed SBD method, using a supervised AdaBoost classifier, achieved a promising 100% accuracy rate with a 19.44% error rate. For future work, we intend to reduce the error rate by implementing end-point detection on the boundary candidates. |
Database: |
OpenAIRE |
External link: |
|
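
The abstract above says that the audio is first split into speech and non-speech segments and that only the non-speech segments are kept as sentence boundary candidates. The record does not say how that split is made, so the following is only a minimal sketch under an assumption: a plain short-time energy threshold stands in for the paper's speech/non-speech classifier. The function names, the 20 ms frame size, the energy threshold and the 200 ms minimum pause are all hypothetical values chosen for illustration and are not taken from the paper.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def boundary_candidates(samples, sample_rate, frame_ms=20,
                        energy_threshold=1e-4, min_pause_ms=200):
    """Return (start_s, end_s) spans of low-energy (non-speech) audio.

    Assumption: low short-time energy is used as a stand-in for the
    speech/non-speech classification mentioned in the abstract.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    min_frames = math.ceil(min_pause_ms / frame_ms)
    energies = frame_energies(samples, frame_len)

    candidates = []
    run_start = None
    # The infinite sentinel closes a low-energy run that reaches the end.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < energy_threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                candidates.append((run_start * frame_ms / 1000,
                                   i * frame_ms / 1000))
            run_start = None
    return candidates


# Synthetic check: 0.5 s tone, 0.3 s silence, 0.5 s tone at 1 kHz sampling.
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 1000) for n in range(500)]
signal = tone + [0.0] * 300 + tone
print(boundary_candidates(signal, 1000))  # -> [(0.5, 0.8)]
```

In a pipeline like the one the abstract outlines, each returned span would then have features extracted around it and be passed to a supervised classifier that decides whether the pause is a true sentence boundary. That later stage is not shown here.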