Usage of Combinational Acoustic Models (DNN-HMM and SGMM) and Identifying the Impact of Language Models in Sinhala Speech Recognition
Authors: | Randil Pushpananda, Ruvan Weerasinghe, Thilini Nadungodage, Buddhi Gamage |
---|---|
Publication year: | 2020 |
Subject: |
Subspace Gaussian Mixture Model; Artificial neural network; Computer science; Speech recognition; Acoustic model; 020206 networking & telecommunications; 02 engineering and technology; 01 natural sciences; Data modeling; Resource (project management); 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Language model; Mel-frequency cepstrum; Hidden Markov model; 010301 acoustics |
Source: | 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer). |
DOI: | 10.1109/icter51097.2020.9325439 |
Description: | Automatic Speech Recognition (ASR) is one of the most discussed areas in Natural Language Processing (NLP) because the technology is needed to enable and improve interaction between humans and computers [1]. High-resource languages such as English have achieved state-of-the-art results, but a low-resource language such as Sinhala still needs considerable improvement. This paper describes the development of an ASR system for the Sinhala language using the Kaldi toolkit [2]; the proposed combinational acoustic model of a Subspace Gaussian Mixture Model (SGMM) and a Deep Neural Network - Hidden Markov Model (DNN-HMM) achieved a 31.72% word error rate (WER). Additionally, we explore the impact of language models using different corpora and provide a preliminary comparison between them. |
Database: | OpenAIRE |
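
The word error rate (WER) quoted in the description is conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring procedure used in the paper; the function name `word_error_rate` and the example sentences are hypothetical placeholders.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative only: whitespace tokenization is an assumption, and the
    paper's own scoring setup is not reproduced here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder transcripts: one substitution and one deletion over four words.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 31.72% corresponds to roughly 32 word-level errors per 100 reference words. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.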