Source-specific Informative Prior for i-Vector Extraction

Authors: Søren Holdt Jensen, Haizhou Li, Zheng-Hua Tan, Kong Aik Lee, Sven Ewan Shepstone
Language: English
Year of publication: 2015
Subject:
Source: Shepstone, S E, Lee, K A, Li, H, Tan, Z-H & Jensen, S H 2015, Source-specific Informative Prior for i-Vector Extraction. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015. IEEE Signal Processing Society, IEEE International Conference on Acoustics, Speech and Signal Processing. Proceedings, pp. 4185-4189, 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015, Brisbane, Australia, 19/04/2015. https://doi.org/10.1109/ICASSP.2015.7178759
ICASSP
DOI: 10.1109/ICASSP.2015.7178759
Description: An i-vector is a low-dimensional, fixed-length representation of a variable-length speech utterance, defined as the posterior mean of a latent variable conditioned on the observed feature sequence of the utterance. The prior for the latent variable is usually assumed to be non-informative, since for homogeneous datasets an informative prior brings no gain in generality. This work shows that, for a heterogeneous dataset containing speech samples recorded from multiple sources, extracting i-vectors with informative priors is applicable and leads to favorable results. Tests on the NIST 2008 and 2010 Speaker Recognition Evaluation (SRE) datasets show that the proposed method beats three baselines. In the short2-short3 core task of SRE'08, it outperformed the baselines in five of the eight common conditions for female speakers and in six of the eight for male speakers; in the core-core task of SRE'10, it did so in five of the nine common conditions for both genders.
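Note: The posterior-mean definition in the description can be made concrete with the standard Gaussian conjugate result. What follows is only a sketch in conventional total-variability notation, none of which appears in this record: $T_c$ and $\Sigma_c$ are assumed to be the loading matrix and covariance of mixture component $c$, $N_c$ and $\tilde{F}_c$ the zeroth-order and centered first-order statistics of the utterance, and $\mu_p$, $\Sigma_p$ the mean and covariance of an assumed Gaussian prior on the latent variable. It is not necessarily the exact formulation used in the paper.

```latex
% Posterior precision of the latent variable under a Gaussian prior N(\mu_p, \Sigma_p)
L = \Sigma_p^{-1} + \sum_{c} N_c \, T_c^{\top} \Sigma_c^{-1} T_c

% i-vector = posterior mean
\phi = L^{-1} \Bigl( \Sigma_p^{-1} \mu_p + \sum_{c} T_c^{\top} \Sigma_c^{-1} \tilde{F}_c \Bigr)

% Special case \mu_p = 0, \Sigma_p = I (standard normal prior)
\phi = \Bigl( I + \sum_{c} N_c \, T_c^{\top} \Sigma_c^{-1} T_c \Bigr)^{-1} \sum_{c} T_c^{\top} \Sigma_c^{-1} \tilde{F}_c
```

Under this sketch, a source-specific prior would amount to choosing a different $(\mu_p, \Sigma_p)$ pair for each recording source, which shifts and rescales the posterior mean most noticeably when the utterance is short and the $N_c$ are small.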
Database: OpenAIRE