'Spanish Políglota': an automatic Speech Recognition system based on HMM
Authors: | Josafa Aguiar, Jonathan A. Zea |
---|---|
Year of publication: | 2021 |
Subject: | 050101 languages & linguistics; Computer science; Speech recognition; 05 social sciences; Acoustic model; Speech synthesis; 02 engineering and technology; Pronunciation; Lexicon; Triphone; Word recognition; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 0501 psychology and cognitive sciences; Language model; Hidden Markov model |
Source: | 2021 Second International Conference on Information Systems and Software Technologies (ICI2ST). |
Description: | The goal of this ASR system is to recognize audio queries that request a static translation of a given Spanish word into a specified language. We call this ASR system the Spanish Políglota. The pronunciation dictionary for the language model is obtained by applying grapheme-to-phoneme conversion. It was developed using Festival Speech Synthesis Scheme scripts and the SPPAS Spanish lexicon. The possible audio queries are restricted by a BNF grammar we designed for this project. A triphone acoustic model was generated from a set of 1621 audio recordings of words. This acoustic model is based on an N-gram model that estimates its probabilities by maximum likelihood estimation (MLE). We evaluated the prediction of individual words as well as of synthetic phrases. We generated 1577 synthetic phrases by concatenating the words of our audio set. Performance was also measured on a new set of audio recordings from a different speaker. Isolated word recognition achieved 77.91% correct predictions. However, performance dropped on the synthetic phrases and on the second speaker’s speech. We consider this an initial step towards the development of a fully functional automatic speech recognition system. |
Database: | OpenAIRE |
External link: |
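
The description mentions an N-gram model whose probabilities are estimated by maximum likelihood. As a minimal sketch of what that estimate looks like for bigrams, the relative-frequency formula P(w2 | w1) = count(w1, w2) / count(w1) can be written as below. The function name `bigram_mle`, the sentence boundary markers, and the toy queries are hypothetical illustrations; they are not taken from the paper, its BNF grammar, or the toolkit the authors used.

```python
from collections import Counter


def bigram_mle(sentences):
    """Estimate P(w2 | w1) by relative frequency (MLE) over whitespace-tokenized sentences."""
    history_counts = Counter()  # how often each word appears as a bigram history
    bigram_counts = Counter()   # how often each (w1, w2) pair appears
    for sentence in sentences:
        # <s> and </s> are assumed boundary markers, a common convention
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {
        (w1, w2): count / history_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


# Hypothetical toy queries, not drawn from the paper's data
queries = [
    "traducir casa al ingles",
    "traducir perro al ingles",
    "traducir casa al frances",
]
probs = bigram_mle(queries)
```

With these three toy queries, "casa" follows "traducir" in two of three cases, so the estimate for that bigram is 2/3. Unsmoothed MLE assigns zero probability to any bigram it has not seen, which is one reason practical systems usually add smoothing on top of it.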