Author: |
Abhay Prasad, Vijitha Periyasamy, Prasanta Kumar Ghosh |
Year of publication: |
2015 |
Subject: |
|
Source: |
ICASSP |
DOI: |
10.1109/icassp.2015.7178775 |
Description: |
Speech articulation varies across speakers producing the same speech sound, owing to differences in their vocal tract morphologies, even though the underlying speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation of multiple speakers uttering the same sentence into its variant and invariant aspects. The variant component is found to be a better representation for discriminating speakers than the full speech articulation, which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields a frame-level speaker identification accuracy that exceeds that of the full speech articulation and of acoustic features by 29.9% and 9.4% (absolute), respectively. |
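To make the variant/invariant split concrete, here is a minimal sketch of one plausible interpretation: given time-aligned articulatory trajectories from several speakers saying the same sentence, treat the cross-speaker mean as the invariant component and each speaker's residual as the variant component. The function name, the mean/residual split, and the toy data are all assumptions for illustration; the paper's actual formulation is not reproduced here.

```python
import numpy as np

def decompose_articulation(trajectories):
    """Split time-aligned articulatory trajectories into invariant and
    variant components.

    trajectories: array of shape (S, T, D) -- S speakers, T aligned
    frames, D articulatory dimensions.

    NOTE: this mean/residual decomposition is only a hypothetical
    illustration of the variant/invariant idea, not the method from the
    paper itself.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    invariant = trajectories.mean(axis=0)   # (T, D): shared across speakers
    variant = trajectories - invariant      # (S, T, D): speaker-specific
    return invariant, variant

# Toy example: 3 speakers, 4 aligned frames, 2 articulatory dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4, 2))
inv, var = decompose_articulation(X)
# By construction, the variant components sum to zero across speakers,
# and invariant + variant reconstructs each speaker's trajectory.
```

Under this reading, a speaker-identification experiment would feed `var` (rather than `X`) to the classifier, since the invariant part carries linguistic rather than speaker information.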
Database: |
OpenAIRE |
External link: |
|