The UMD-JHU 2011 speaker recognition system

Authors:	Sri Harish Mallidi, Yuancheng Luo, Padmanabhan Rajan, Dmitry N. Zotkin, Sridhar Krishna Nemala, Ramani Duraiswami, Balaji Vasan Srinivasan, Xinhui Zhou, Shihab A. Shamma, Hynek Hermansky, Mounya Elhilali, Sriram Ganapathy, Samuel Thomas, Gsvs Sivaram, Daniel Garcia-Romero, Majid Mirbagheri, Thomas Janu, Nima Mesgarani
Publication year:	2012
Subject:	Reverberation; Computer science; Speech recognition; Pattern recognition; Linear prediction; Speaker recognition; Discriminative model; Robustness (computer science); Vocal effort; NIST; Mel-frequency cepstrum; Artificial intelligence
Source:	ICASSP
DOI:	10.1109/icassp.2012.6288852
Description:	In recent years, there have been significant advances in the field of speaker recognition that have resulted in very robust recognition systems. The primary focus of many recent developments has shifted to the problem of recognizing speakers in adverse conditions, e.g., in the presence of noise or reverberation. In this paper, we present the UMD-JHU speaker recognition system applied to the NIST 2010 SRE task. The novel aspects of our system are: 1) improved performance on trials involving different vocal effort via the use of linear-scale features; 2) expected improved recognition performance in the presence of reverberation and noise via the use of frequency domain perceptual linear predictor and cortical features; 3) a new discriminative kernel partial least squares (KPLS) framework that complements the state-of-the-art back-end systems JFA and PLDA to improve overall recognition; and 4) acceleration of the JFA, PLDA and KPLS back-ends via distributed computing. The individual components of the system and the fused system are compared against a baseline JFA system and against results reported by SRI and MIT-LL on SRE2010.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e9c56c7e2c7be4ef08795befcaf1bb35 https://doi.org/10.1109/icassp.2012.6288852