Improved DNN-HMM English Acoustic Model Specially For Phonotactic Language Recognition

Authors:	Rui-Li Du, Ying Yin, He Jing, Huang Yubin, Ying-Xin Gan, Ya-Nan Li, Liu Jianzhong, Hai-Feng Yan, Ruan Ting, Wei-Wei Liu, Li Wei, Guo-Chun Li, Zhang Shengge, Wei Liu, Hua-ying Bai, Yan-Miao Song, Jian Hua Zhou, Cun-Xue Zhang
Year of publication:	2019
Subject:	Phonotactics; 030507 speech-language pathology & audiology; 03 medical and health sciences; Phone; Computer science; Speech recognition; Word error rate; NIST; Acoustic model; TIMIT; 0305 other medical science; Hidden Markov model; Cluster analysis
Source:	IALP
Description:	The now-acknowledged sensitivity of Phonotactic Language Recognition (PLR) to the performance of the phone recognizer front-end has spawned interest in developing methods to improve it. In this paper, an improved Deep Neural Network Hidden Markov Model (DNN-HMM) English acoustic model front-end designed specifically for phonotactic language recognition is proposed, and a series of methods, namely dictionary merging, phoneme splitting, phoneme clustering, state clustering and DNN-HMM acoustic modeling (DPPSD), is introduced to balance the generalization and the accuracy of the speech tokenization process in PLR. Experiments are carried out on the National Institute of Standards and Technology Language Recognition Evaluation 2009 (NIST LRE 2009) database. The results show that the phonotactic language recognition system based on the DPPSD English acoustic model yields equal error rates (EER) of 2.09%, 6.60% and 19.72% for 30 s, 10 s and 3 s, respectively, using state-of-the-art techniques, outperforming the language recognition results obtained with both the TIMIT and CMU dictionaries and with other phoneme clustering methods.
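The results above are reported as equal error rate (EER), the operating point at which the miss rate and the false-alarm rate of a detector coincide. A minimal sketch of how an EER can be estimated from detection scores follows. The function name `equal_error_rate` and the score lists are hypothetical and chosen only for illustration; they are not taken from the paper, which does not state what scoring tooling was used.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping every observed score as a threshold.

    Returns (eer, threshold) at the threshold where the miss rate and
    the false-alarm rate are closest; the EER is their average there.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # A trial is accepted when its score is >= the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (miss + false_alarm) / 2, t)
    return best[1], best[2]


# Illustrative scores only (assumed values, not experimental data).
target = [2.1, 1.7, 0.9, 0.4, 1.2]        # trials where the language matches
nontarget = [-1.0, 0.5, -0.3, 0.1, 1.0]   # trials where it does not

eer, threshold = equal_error_rate(target, nontarget)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch takes the closest pair; evaluation toolkits often interpolate between neighbouring thresholds instead.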
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::27a39785203679470e092dc0ae65199a https://doi.org/10.1109/ialp48816.2019.9037696