Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006
Authors: | Frantisek Grezl, Martin Karafiat, Jan Cernocky, Ondrej Glembek, Petr Schwarz, Pavel Matejka, Niko Brümmer, D.A. van Leeuwen, Albert Strasheim, Lukas Burget |
---|---|
Contributors: | TNO Defensie en Veiligheid |
Year of publication: | 2007 |
Subject: |
Acoustics and Ultrasonics
Computer science; Speech recognition; Cognitive neuroscience of visual object recognition; Linear prediction; Object recognition; Vectors; Speaker recognition; Speech processing; Support vector machine; Gaussian mixture model (GMM); Communication channels (information theory); Magnetostrictive devices; Nuisance attribute projection (NAP); NIST; Eigenchannel; Mel-frequency cepstrum; Electrical and Electronic Engineering; Fusion; Image retrieval; Maximum likelihood |
Source: | IEEE Transactions on Audio, Speech, and Language Processing. 15:2072-2084 |
ISSN: | 1558-7924 1558-7916 |
DOI: | 10.1109/tasl.2007.902870 |
Description: | This paper describes and discusses the "STBU" speaker recognition system, which performed well in the NIST Speaker Recognition Evaluation 2006 (SRE). STBU is a consortium of four partners: Spescom DataVoice (Stellenbosch, South Africa), TNO (Soesterberg, The Netherlands), BUT (Brno, Czech Republic), and the University of Stellenbosch (Stellenbosch, South Africa). The STBU system was a combination of three main kinds of subsystems: 1) Gaussian mixture model (GMM), with short-time Mel-frequency cepstral coefficient (MFCC) or perceptual linear prediction (PLP) features, 2) Gaussian mixture model-support vector machine (GMM-SVM), using GMM mean supervectors as input to an SVM, and 3) maximum-likelihood linear regression-support vector machine (MLLR-SVM), using MLLR speaker adaptation coefficients derived from an English large vocabulary continuous speech recognition (LVCSR) system. All subsystems made use of supervector subspace channel compensation methods, either eigenchannel adaptation or nuisance attribute projection. We document the design and performance of all subsystems, as well as their fusion and calibration via logistic regression. Finally, we also present a cross-site fusion that was done with several additional systems from other NIST SRE-2006 participants. © 2006 IEEE. |
Database: | OpenAIRE |
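
The description mentions fusing and calibrating subsystem scores via logistic regression. This record gives no details of how the paper does it, so the following is only a minimal sketch of the general idea, not the authors' implementation. It learns one weight per subsystem plus an offset so that the weighted sum of scores behaves like a log-odds score. The function names (`train_fusion`, `fuse`, `make_toy_trials`), the learning rate, the epoch count, and the synthetic three-subsystem scores are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train_fusion(scores, labels, lr=0.1, epochs=500):
    """Fit fusion weights w and offset b by batch gradient descent
    on the logistic-regression (cross-entropy) objective.

    scores: list of per-trial score vectors, one entry per subsystem.
    labels: 1 for target trials, 0 for non-target trials.
    """
    n_sys = len(scores[0])
    n = len(scores)
    w = [0.0] * n_sys
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_sys
        grad_b = 0.0
        for s, y in zip(scores, labels):
            # Prediction error for this trial under the current fusion.
            err = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - y
            for k in range(n_sys):
                grad_w[k] += err * s[k]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, s):
    """Fused score: weighted sum of subsystem scores plus offset."""
    return sum(wi * si for wi, si in zip(w, s)) + b


def make_toy_trials(n_per_class=200, seed=0):
    """Synthetic scores for three hypothetical subsystems (assumption:
    Gaussian scores, with noisier subsystems having larger spread)."""
    rng = random.Random(seed)
    scores, labels = [], []
    for label, mean in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            scores.append([rng.gauss(mean, 1.0),
                           rng.gauss(mean, 1.5),
                           rng.gauss(mean, 2.0)])
            labels.append(label)
    return scores, labels


scores, labels = make_toy_trials()
w, b = train_fusion(scores, labels)
fused = [fuse(w, b, s) for s in scores]
```

In this sketch a fused score above zero favors the target hypothesis. A real evaluation would train the fusion on held-out development trials rather than on the trials being scored.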