Automatic Speaker Identification Using Clinically Depressed Speech Content

Authors: Sheeraz Memon, Faisal Karim Shaikh, Javed Ali Baloch
Language: English
Year of publication: 2012
Subject:
Source: Mehran University Research Journal of Engineering and Technology, Vol 31, Iss 2, Pp 259-264 (2012)
Document type: article
ISSN: 0254-7821
2413-7219
Description: The environment largely affects the performance of automatic speaker recognition. This work investigates the effects of a clinical environment on the task of speaker recognition. Two sets of speakers are used: a clinical set consisting of speech samples from 70 clinically depressed speakers, and a control set comprising 68 clinically non-depressed speakers. MFCCs (Mel Frequency Cepstral Coefficients) are used for feature extraction, and several modeling methods are applied: GMM-EM (Gaussian Mixture Models based on Expectation Maximization), GMM-Kmeans (GMM based on K-means), GMM-LBG (GMM based on Linde-Buzo-Gray), and GMM-ITVQ (GMM based on Information Theoretic Vector Quantization). The modeling methods are evaluated on this novel speech corpus. The results suggest that speaker recognition rates are lower for the depressed speakers (60-71%) than for the non-depressed speakers (79-89%). The paper further investigates the performance of VQ (Vector Quantization) based Gaussian modeling and proposes a novel approach called GMM-ITVQ. The results suggest that GMM-EM yields higher recognition rates; however, the performance of GMM-ITVQ is comparable to that of GMM-EM.
Database: Directory of Open Access Journals
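
The description above outlines a pipeline of per-frame feature vectors, one Gaussian mixture model per speaker, and identification by best-scoring model. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation. It assumes diagonal covariances, uses K-means only to initialise the means before EM (one possible reading of how the two steps combine), and substitutes synthetic 4-dimensional vectors for real MFCCs. All function names, speaker labels, and parameter values are hypothetical. GMM-LBG and GMM-ITVQ are not shown.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def kmeans(frames, k, iters=10, seed=0):
    """Plain K-means, used here only to initialise the GMM means."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(frames, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in frames:
            nearest = min(range(k), key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(x, centers[j])))
            groups[nearest].append(x)
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def train_gmm(frames, k=2, em_iters=20, var_floor=1e-3):
    """Fit (weights, means, variances) with K-means init followed by EM."""
    dim = len(frames[0])
    means = kmeans(frames, k)
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(em_iters):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10
            weights[j] = nj / len(frames)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj
                        for d in range(dim)]
            variances[j] = [max(var_floor,
                                sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, x in zip(resp, frames)) / nj)
                            for d in range(dim)]
    return weights, means, variances

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of frames under one speaker model."""
    weights, means, variances = gmm
    total = sum(logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                           for w, m, v in zip(weights, means, variances)])
                for x in frames)
    return total / len(frames)

def identify(frames, models):
    """Return the name of the speaker model that scores the frames highest."""
    return max(models, key=lambda name: avg_loglik(frames, models[name]))

def fake_frames(center, n, rng):
    """Synthetic stand-in for MFCC frames: unit-variance noise around center."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(1)
train = {
    "spk_a": fake_frames([0, 0, 0, 0], 100, rng) + fake_frames([2, -2, 2, -2], 100, rng),
    "spk_b": fake_frames([5, 5, 5, 5], 100, rng) + fake_frames([7, 3, 7, 3], 100, rng),
}
models = {name: train_gmm(frames) for name, frames in train.items()}
predicted = identify(fake_frames([5, 5, 5, 5], 50, rng), models)
print(predicted)
```

In a real system the synthetic frames would be replaced by MFCC vectors extracted from each speaker's recordings, and the number of mixture components would typically be much larger than two.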