Development of the compact English LVCSR acoustic model for embedded entertainment robot applications

Author:	Ajay Patrikar, Xavier Menendez-Pidal, Lex Olorenshaw, Hitoshi Honda
Year of publication:	2007
Subject:	Linguistics and Language; Vocabulary; Entertainment robot; Computer science; Speech recognition; Decision tree; Acoustic model; Machine learning; Triphone; Language and Linguistics; Human-Computer Interaction; Search engine; Computer Science::Sound; Redundancy (engineering); Computer Vision and Pattern Recognition; Artificial intelligence; Hidden Markov model; Software
Source:	International Journal of Speech Technology. 10:63-74
ISSN:	1572-8110 1381-2416
DOI:	10.1007/s10772-008-9012-6
Description:	In this paper we discuss two techniques to reduce the size of the acoustic model while maintaining or improving the accuracy of the recognition engine. The first technique, demiphone modeling, reduces the redundancy in a context-dependent, state-clustered Hidden Markov Model (HMM). Three-state demiphones optimally designed from the triphone decision tree are introduced to drastically reduce the phone space of the acoustic model and to improve system accuracy. The second redundancy elimination technique is a more classical approach based on parameter tying. Similar variance vectors in each HMM cluster are tied together to reduce the number of parameters. The closeness between variance vectors is measured using a Vector Quantizer (VQ) to preserve the information carried by the variance parameters. The paper also reports speech recognition improvements from assigning a variable number of Gaussians per cluster and from gender-based HMMs. The main motivation behind these techniques is to improve the acoustic model and at the same time lower its memory usage. These techniques may help reduce memory use and improve the accuracy of an embedded Large Vocabulary Continuous Speech Recognition (LVCSR) application.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::b323e35a77e20392a3cf6a96f9b2437d https://doi.org/10.1007/s10772-008-9012-6
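
Note:	The variance-tying idea in the description (similar variance vectors share one stored vector, chosen by a vector quantizer) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which quantizer or distance measure the paper uses, so plain k-means with squared Euclidean distance is assumed here, and the names `tie_variances` and `codebook_size` are hypothetical.

```python
import random


def tie_variances(variances, codebook_size, iterations=20, seed=0):
    """Cluster variance vectors with plain k-means (assumed quantizer).

    Returns (codebook, assignment). Vector i is replaced by
    codebook[assignment[i]], so only codebook_size vectors are stored.
    """
    rng = random.Random(seed)
    # Initialise the codebook from randomly chosen input vectors.
    codebook = [list(v) for v in rng.sample(variances, codebook_size)]
    assignment = [0] * len(variances)
    for _ in range(iterations):
        # Assignment step: nearest codeword by squared Euclidean distance.
        for i, v in enumerate(variances):
            assignment[i] = min(
                range(codebook_size),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, codebook[k])),
            )
        # Update step: each codeword becomes the mean of its members.
        for k in range(codebook_size):
            members = [v for v, a in zip(variances, assignment) if a == k]
            if members:  # an empty codeword is left unchanged
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook, assignment


# Toy data (invented for illustration): six 2-dimensional variance vectors.
variances = [
    [1.0, 1.1], [1.1, 0.9], [0.9, 1.0],  # one group of similar vectors
    [4.0, 4.2], [4.1, 3.9], [3.9, 4.0],  # a second group
]
codebook, assignment = tie_variances(variances, codebook_size=2)
tied = [codebook[a] for a in assignment]  # each Gaussian now points at a shared vector
```

In this toy case six variance vectors are stored as two shared vectors plus one small index per Gaussian, which is where the memory saving would come from; how much accuracy is retained depends on the codebook size and on the real distance measure, neither of which is given in this record.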