Usage of Combinational Acoustic Models (DNN-HMM and SGMM) and Identifying the Impact of Language Models in Sinhala Speech Recognition

Authors: Randil Pushpananda, Ruvan Weerasinghe, Thilini Nadungodage, Buddhi Gamage
Year of publication: 2020
Subject:
Source: 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).
DOI: 10.1109/icter51097.2020.9325439
Description: Automatic Speech Recognition (ASR) is one of the most discussed areas in Natural Language Processing (NLP) because the technology is needed to enable and improve interaction between humans and computers [1]. High-resource languages such as English have achieved state-of-the-art results, but a low-resource language such as Sinhala still needs substantial improvement. This paper describes the development of an ASR system for the Sinhala language using the Kaldi toolkit [2]. The proposed combinational acoustic model, which combines a Subspace Gaussian Mixture Model (SGMM) with a Deep Neural Network - Hidden Markov Model (DNN-HMM), achieved a word error rate (WER) of 31.72%. Additionally, we explore the impact of language models using different corpora and provide a preliminary comparison between them.
Database: OpenAIRE
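
As a reference for the WER figure quoted in the description, the following is a minimal Python sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example strings are illustrative assumptions; this is not the paper's evaluation code or Kaldi's own scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; tokenization is plain whitespace
    splitting, which is an assumption and may not suit every language.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 31.72% corresponds to roughly 32 word-level errors per 100 reference words. Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.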