Usage of Combinational Acoustic Models (DNN-HMM and SGMM) and Identifying the Impact of Language Models in Sinhala Speech Recognition

Authors:	Randil Pushpananda, Ruvan Weerasinghe, Thilini Nadungodage, Buddhi Gamage
Publication year:	2020
Subject:	Subspace Gaussian Mixture Model; Artificial neural network; Computer science; Speech recognition; Acoustic model; 020206 networking & telecommunications; 02 engineering and technology; 01 natural sciences; Data modeling; Resource (project management); 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Language model; Mel-frequency cepstrum; Hidden Markov model; 010301 acoustics
Source:	2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).
DOI:	10.1109/icter51097.2020.9325439
Description:	Automatic Speech Recognition (ASR) is one of the most discussed areas in Natural Language Processing (NLP) because the technology is needed to enable and improve interaction between humans and computers [1]. High-resource languages such as English have achieved state-of-the-art results, but for a low-resource language such as Sinhala there is still considerable room for improvement. This paper describes the development of an ASR system for the Sinhala language using the Kaldi toolkit [2]; the proposed combinational acoustic model of Subspace Gaussian Mixture Model (SGMM) and Deep Neural Network - Hidden Markov Model (DNN-HMM) achieved a word error rate (WER) of 31.72%. Additionally, we explore the impact of language models using different corpora and provide a preliminary comparison between them.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::a8dfe4057b05fd235eae13e6deabdd27 https://doi.org/10.1109/icter51097.2020.9325439