Estimation of the invariant and variant characteristics in speech articulation and its application to speaker identification

Authors: Abhay Prasad, Vijitha Periyasamy, Prasanta Kumar Ghosh
Publication year: 2015
Subject:
Source: ICASSP
DOI: 10.1109/icassp.2015.7178775
Description: Speech articulation for a given speech sound varies across speakers because of differences in their vocal tract morphologies, even though the speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation of multiple speakers speaking the same sentence into its variant and invariant aspects. The variant component is found to be a better representation for discriminating speakers than the full speech articulation, which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields better frame-level speaker identification accuracy than the full speech articulation and than acoustic features, by 29.9% and 9.4% (absolute), respectively.
Database: OpenAIRE