Author: |
Abhay Prasad, Vijitha Periyasamy, Prasanta Kumar Ghosh |
Year of publication: |
2015 |
Subject: |
|
Source: |
ICASSP |
DOI: |
10.1109/icassp.2015.7178775 |
Description: |
Speech articulation varies across speakers producing the same speech sound, owing to differences in their vocal tract morphologies, even though the underlying speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation of multiple speakers uttering the same sentence into its variant and invariant aspects. The variant component is found to be a better representation for discriminating speakers than the full speech articulation, which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields a frame-level speaker identification accuracy that exceeds that of the full speech articulation and of acoustic features by 29.9% and 9.4% (absolute), respectively. |
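To make the variant/invariant split concrete, here is a minimal sketch of one plausible interpretation: given time-aligned articulatory trajectories from several speakers saying the same sentence, treat the cross-speaker mean as the invariant component and each speaker's residual as the variant component. The function name, the mean/residual split, and the toy data are all assumptions for illustration; the paper's actual formulation is not reproduced here.

```python
import numpy as np

def decompose_articulation(trajectories):
    """Split time-aligned articulatory trajectories into invariant and
    variant components.

    trajectories: array of shape (S, T, D) -- S speakers, T aligned
    frames, D articulatory dimensions.

    NOTE: this mean/residual decomposition is only a hypothetical
    illustration of the variant/invariant idea, not the method from the
    paper itself.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    invariant = trajectories.mean(axis=0)   # (T, D): shared across speakers
    variant = trajectories - invariant      # (S, T, D): speaker-specific
    return invariant, variant

# Toy example: 3 speakers, 4 aligned frames, 2 articulatory dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4, 2))
inv, var = decompose_articulation(X)
# By construction, the variant components sum to zero across speakers,
# and invariant + variant reconstructs each speaker's trajectory.
```

Under this reading, a speaker-identification experiment would feed `var` (rather than `X`) to the classifier, since the invariant part carries linguistic rather than speaker information.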
Database: |
OpenAIRE |
External link: |
|