'Spanish Políglota': an automatic Speech Recognition system based on HMM

Authors: Josafa Aguiar, Jonathan A. Zea
Publication year: 2021
Subject:
Source: 2021 Second International Conference on Information Systems and Software Technologies (ICI2ST).
Description: The goal of this ASR system is to recognize audio queries that request the static translation of a given Spanish word into a specified language. We call this ASR system the Spanish Políglota. The pronunciation dictionary for the language model is obtained by applying grapheme-to-phoneme conversion. It was developed via Festival Speech Synthesis Scheme scripts and the SPPAS Spanish lexicon. The possible audio queries are restricted by a BNF grammar we designed for this project. A triphone acoustic model was generated from a set of 1621 audio recordings of words. This acoustic model is based on an N-gram model that estimates its probabilities by maximum likelihood estimation (MLE). We evaluated the prediction of individual words as well as of synthetic phrases. We generated 1577 synthetic phrases by concatenating the words of our audio set. The performance was also measured on a new set of audio recordings from a different speaker. Isolated word recognition achieved 77.91% correct predictions. However, the performance dropped on both the synthetic phrases and the second speaker's speech. We consider this work an initial step towards the development of a fully functional automatic speech recognition system.
Database: OpenAIRE