Pronunciation Training on Isolated Kannada Words using 'Kannada Kali' - A Cloud Based Smart Phone Application

Authors: Ankit Anand, Avinash Kumar, Ajay Cholin, Dinkar Sitaram, Akshay Venkatesh, Lingaraj Kothiwale, Viraj Kumar, Savitha Murthy, Ankita Shetty, Aditya D. Bhat
Language: English
Year of publication: 2018
Subject:
Source: IndraStra Global.
ISSN: 2381-3652
Description: An automated pronunciation feedback system on a smartphone is useful for a student trying to learn a new language at his or her own pace. The objective of our research is to implement a pronunciation training system with minimal language-specific data. Our proposed system consists of an Android application as the front-end and a pronunciation evaluation and mispronunciation detection framework as the back-end, hosted on a cloud. We conduct our experiments on spoken isolated words in Kannada. Our cloud implementation of pronunciation evaluation (for a spoken word) involves training a classifier with features from Dynamic Time Warping (DTW) applied to Mel Frequency Cepstral Coefficients (MFCC) and Line Spectral Frequencies (LSF), and with features taken directly from LSF (without DTW). We study the performance of different machine learning algorithms for pronunciation rating. We propose a novel semi-supervised approach for detecting mispronounced segments of a word using Self Organizing Maps (SOM), which are also deployed on the cloud. Our implementation of SOM learns the features of automatically segmented reference speech. The trained SOM is then used to determine the deviations in the learner's pronunciation. We evaluate our system on 1169 Kannada audio samples from students around 18 to 25 years of age. The Kannada words considered are taken from first and second grade textbooks (treating learners as beginners who do not know Kannada) and include words of 2 to 5 syllables. We report accuracy on binary classification and multi-class classification for different classifiers. The mispronounced segments detected using SOM correlate with the human ratings. Our approach to pronunciation evaluation and mispronunciation detection is based on minimal data and does not require a speech recognition system.
Database: OpenAIRE
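
The description mentions features derived from Dynamic Time Warping between a reference utterance and a learner's utterance. The following is a minimal sketch of a textbook DTW distance, not the authors' implementation. It assumes each utterance has already been converted into a sequence of per-frame feature vectors (for example MFCC or LSF frames); the function names `euclidean` and `dtw_distance`, the Euclidean frame distance, and the toy input values are all illustrative assumptions that do not come from the record.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(ref, test):
    """Accumulated DTW cost between two sequences of feature frames.

    ref and test are lists of frames; each frame is a list of floats
    (hypothetical stand-ins for MFCC or LSF vectors).
    """
    n, m = len(ref), len(test)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(ref[i - 1], test[j - 1])
            # Allowed steps: advance in ref only, in test only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    return cost[n][m]


# Toy example: the learner's utterance is a time-stretched copy of the reference.
reference = [[0.0], [1.0], [2.0]]
learner = [[0.0], [1.0], [1.0], [2.0]]
stretched_cost = dtw_distance(reference, learner)
```

A distance of this kind could serve as one input feature to a classifier that rates pronunciation, since DTW absorbs differences in speaking rate before the frames are compared. How the authors actually construct their features is not stated in the record.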