Phase characteristics of vocal tract filter can distinguish speakers

Authors:	Masahiro Okada, Hiroshi Ito
Language:	English
Year of publication:	2023
Subject:	speaker recognition; phase characteristics of vocal tract filter; phase spectrogram; averaged phase spectrum; raw phase; Applied mathematics. Quantitative methods (T57-57.97); Probabilities. Mathematical statistics (QA273-280)
Source:	Frontiers in Applied Mathematics and Statistics, Vol 9 (2023)
Document type:	article
ISSN:	2297-4687
DOI:	10.3389/fams.2023.1274846
Description:	Introduction: Speaker recognition has traditionally relied on individual variations in the power spectrograms of speech, which reflect the resonance phenomena in the speaker's vocal tract filter. In recent years, phase-based features have also been used for speaker recognition. However, these features are not the raw phase but are hand-crafted, so the role of the raw phase itself remains hard to interpret. This study used phase spectrograms, which are calculated by subtracting the phase of the electroglottograph signal in the time-frequency domain from that of the speech signal. The phase spectrograms represent the unmodified phase characteristics of the vocal tract filter. Methods: The phase spectrograms were obtained from five Japanese participants. The portions of the phase spectrograms corresponding to vowels, called phase spectra, were then extracted and circular-averaged for each vowel. Speakers were identified based on the degree of similarity between the averaged spectra. Results: The accuracy of discriminating speakers using the averaged phase spectra was high, even though only phase information, without power, was used. In particular, the averaged phase spectra had different shapes for different speakers, so the similarity between spectrum pairs from different speakers was lower. The speakers could therefore be distinguished using phase spectra. Discussion: This predominance of phase spectra suggests that the phase characteristics of the vocal tract filter reflect the individuality of speakers.
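The processing chain described in the abstract (phase subtraction in the time-frequency domain, then circular averaging, then a similarity comparison) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not give the window type, frame length, hop size, or the similarity measure, so the Hann window, `n_fft=256`, `hop=64`, and the cosine-based `phase_similarity` below are all hypothetical choices.

```python
import numpy as np


def stft(x, n_fft=256, hop=64):
    """Short-time Fourier transform; returns an array of shape (frames, bins).

    The Hann window and the frame/hop sizes are assumptions for illustration.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def phase_spectrogram(speech, egg, n_fft=256, hop=64):
    """Phase of speech minus phase of the EGG signal, wrapped to (-pi, pi].

    Multiplying by the complex conjugate subtracts the phases without
    having to unwrap them first.
    """
    s = stft(speech, n_fft, hop)
    e = stft(egg, n_fft, hop)
    return np.angle(s * np.conj(e))


def circular_mean(phases, axis=0):
    """Circular average: mean of unit phasors, converted back to an angle."""
    return np.angle(np.mean(np.exp(1j * phases), axis=axis))


def phase_similarity(a, b):
    """Hypothetical similarity: mean cosine of the phase difference.

    Equals 1 for identical spectra; the measure used in the paper
    is not stated in this record.
    """
    return float(np.mean(np.cos(a - b)))
```

In use, `circular_mean(phase_spectrogram(speech, egg), axis=0)` would give one averaged phase spectrum for a vowel segment, and `phase_similarity` could then compare such spectra between speakers. The circular mean matters because an arithmetic mean of phases near +π and -π would wrongly land near 0.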
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/16fc63f013244088a8e1913ef56804b0