Comparison of different Acoustic Models for Kannada language using Kaldi Toolkit

Authors:	T Sahana, K. Jeeva Priya, Naga Srilasya, Deepa Gupta, S. Vinay
Year of publication:	2018
Subject:	021110 strategic defence & security studies; Computer science; Speech recognition; 0211 other engineering and technologies; Word error rate; 020206 networking & telecommunications; 02 engineering and technology; Mixture model; Data modeling; Data set; 0202 electrical engineering, electronic engineering, information engineering; Noise (video); Mel-frequency cepstrum; Hidden Markov model
Source:	ICACCI
Description:	This paper describes a speech recognition system for the South Indian language Kannada, built with the Kaldi toolkit. Kaldi is an open-source toolkit based on finite-state transducers (FSTs). Two speech data sets were collected from 10 different speakers (5 male and 5 female). The first data set is a Kannada digit corpus in which each speaker spoke a number ten times, and the second data set consists of simple Kannada phrases. Most of the noise was filtered manually, and the data was segmented using the software application Audacity (v2.2.2). The main objective is to compare the word error rate (WER) on the two data sets using different acoustic models based on Gaussian Mixture Models (GMM) and Subspace Gaussian Mixture Models (SGMM). For the first data set, the WER for GMM and SGMM is 4.54% and 4.27% respectively; for the second data set, the WER for GMM and SGMM is 12.27% and 13% respectively.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::fb640a0bf1b9c5a4904eefb9b49d364f https://doi.org/10.1109/icacci.2018.8554586
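
The word error rate (WER) reported in the description is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the paper's or Kaldi's own scoring code. The function name `word_error_rate` and the sample sentences are hypothetical and are not taken from the paper's data sets.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper only; the name and interface are assumptions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion over four words.
    print(f"{word_error_rate('one two three four', 'one too three'):.2%}")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.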