Robust Audio Speaker Segmentation using One Class SVMs

Authors: Kadri, Hachem; Davy, Manuel; Rabaoui, Asma; Lachiri, Zied; Ellouze, Noureddine
Contributors: Unité de Recherche Signal, Image et Reconnaissance de Formes, Ecole Nationale d'Ingénieurs de Tunis (ENIT), Université de Tunis El Manar (UTM); LAGIS-SI, Laboratoire d'Automatique, Génie Informatique et Signal (LAGIS), Université de Lille, Sciences et Technologies-Centrale Lille-Centre National de la Recherche Scientifique (CNRS)
Language: English
Year of publication: 2008
Subject:
Source: Proceedings of the EURASIP EUSIPCO'08
European Signal Processing Conference (EUSIPCO-2008), 2008, Switzerland. http://www.eurasip.org/Proceedings/Eusipco/Eusipco2008/index.html
Description: This paper presents a new technique for segmenting an audio stream into pieces, each containing speech from only one speaker. Speaker segmentation has been used extensively in tasks such as automatic transcription of radio broadcast news and audio indexing. The segmentation method is based on a discriminative distance measure between two adjacent sliding windows operating on preprocessed speech. The proposed detection method is unsupervised and does not require any pre-trained models; it uses the exponential family model and 1-SVMs to approximate the generalized likelihood ratio. Our 1-SVM-based segmentation algorithm provides improvements over baseline approaches that use the Bayesian Information Criterion (BIC). The segmentation results achieved in our experiments illustrate the potential of this method for detecting speaker changes in audio streams containing overlapping and short speech segments.
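Illustrative note: the description mentions two adjacent sliding windows, a generalized likelihood ratio, and a BIC baseline. The sketch below shows only the BIC-style baseline idea on a synthetic one-dimensional feature stream, using single-Gaussian models. It is not the authors' 1-SVM method, and the function names (`log_var`, `delta_bic`, `score_stream`), the window length, the step, and the penalty weight `lam` are all assumptions chosen for illustration.

```python
import math
import random

def log_var(xs):
    """Log of the maximum-likelihood variance of a 1-D sample."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return math.log(max(v, 1e-12))  # floor avoids log(0) on constant windows

def delta_bic(left, right, lam=1.0):
    """Gaussian log-likelihood ratio (one model vs. two) minus a BIC penalty."""
    n1, n2 = len(left), len(right)
    n = n1 + n2
    glr = 0.5 * (n * log_var(left + right)
                 - n1 * log_var(left)
                 - n2 * log_var(right))
    # For d = 1 the extra parameters cost 0.5 * (d + d(d+1)/2) * log n = log n.
    penalty = lam * math.log(n)
    return glr - penalty

def score_stream(x, win=100, step=10):
    """Slide two adjacent windows over x; return (boundary index, score) pairs."""
    return [(t, delta_bic(x[t - win:t], x[t:t + win]))
            for t in range(win, len(x) - win + 1, step)]

# Synthetic stream (assumption): the feature mean shifts at sample 300.
random.seed(0)
stream = ([random.gauss(0.0, 1.0) for _ in range(300)]
          + [random.gauss(4.0, 1.0) for _ in range(300)])

scores = score_stream(stream)
best_t, best_score = max(scores, key=lambda p: p[1])
print(best_t, round(best_score, 1))
```

A positive score at a boundary suggests that two separate models explain the adjacent windows better than one, which is the usual cue for a candidate change point. The method described in the record replaces this Gaussian likelihood ratio with an approximation built from 1-SVMs.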
Database: OpenAIRE