Filipino Speech to Text System using Convolutional Neural Network

Authors: John Vincent M. Dabo, Leonardo A. Samaniego, Stanley Glenn E. Brucal, Gabriel B. Dela Cruz, Einstein D. Yong, Jude Nico L. Martin, Kenneth Miguel L. Lagunay
Publication year: 2021
Subject:
Source: 2021 Fifth World Conference on Smart Trends in Systems Security and Sustainability (WorldS4).
Description: The researchers found little to no prior work on closed captioning for the Tagalog language. In the most recent studies they identified, system accuracy ranges from 81% to 87% over a word bank of 35 Tagalog words. With so small a word bank, such systems lack the vocabulary to form sentences usable in documentaries, news, and similar content. The researchers therefore set out to build a speech-to-text system with higher accuracy and a larger vocabulary without compromising speed. The system is implemented in Python as a convolutional neural network (CNN) and uses a low-pass filter, Librosa MFCC feature extraction, and Keras modeling. It yielded an accuracy of 66.17% for male speakers, 81.64% for female speakers, 38.43% with background noise, and 54.14% with a 1 kHz monotone. The system's speed averaged 1.47 seconds, and the vocabulary had a 100% recognition rate. The Filipino Speech to Text system uses the convolutional neural network in both its training and testing phases, in which audio files are converted into 3D models that are used to recognize a word.
Database: OpenAIRE
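
The description names a low-pass filter as the first stage of the pipeline but does not state its type, cutoff frequency, or the audio sample rate. The sketch below is therefore only an illustration of the general idea, not the authors' filter: it uses a generic first-order (one-pole) low-pass filter in plain Python, with an assumed 16 kHz sample rate and an assumed 1 kHz cutoff, and shows that a low tone passes almost unchanged while a high tone is strongly attenuated. The function names are hypothetical, and the MFCC extraction and Keras CNN stages are not shown.

```python
import math


def one_pole_low_pass(samples, cutoff_hz, sample_rate_hz):
    """Generic first-order low-pass filter (illustrative; not the paper's filter)."""
    # Smoothing factor derived from the RC time constant of the chosen cutoff.
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)

    filtered = []
    previous = 0.0
    for x in samples:
        # Move the output a fraction alpha of the way toward the new input.
        previous = previous + alpha * (x - previous)
        filtered.append(previous)
    return filtered


def sine(freq_hz, sample_rate_hz, n_samples):
    """Unit-amplitude sine wave, used here as a stand-in for recorded audio."""
    return [
        math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


SAMPLE_RATE = 16000  # assumption: not stated in the description
CUTOFF = 1000        # assumption: not stated in the description

# One second each of a 200 Hz tone (below cutoff) and a 6 kHz tone (above it).
low_tone = one_pole_low_pass(sine(200, SAMPLE_RATE, SAMPLE_RATE), CUTOFF, SAMPLE_RATE)
high_tone = one_pole_low_pass(sine(6000, SAMPLE_RATE, SAMPLE_RATE), CUTOFF, SAMPLE_RATE)

print(f"200 Hz RMS after filtering:  {rms(low_tone):.3f}")
print(f"6000 Hz RMS after filtering: {rms(high_tone):.3f}")
```

In a pipeline like the one described, the filtered signal would then be passed to MFCC feature extraction before reaching the network; a steeper filter design would be a reasonable substitute where sharper rejection above the cutoff is needed.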