An integrated top-down/bottom-up approach to speaker diarization

Author: Bozonnet, S., Evans, N., Fredouille, C., Wang, D., Raphaël Troncy
Contributors: Bozonnet, Simon, Eurecom [Sophia Antipolis], Laboratoire Informatique d'Avignon (LIA), Centre d'Enseignement et de Recherche en Informatique - CERI-Avignon Université (AU), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI
Language: English
Year of publication: 2010
Subject:
Source: Interspeech 2010, September 26-30, Makuhari, Japan
Scopus-Elsevier
Description: Most speaker diarization systems fit into one of two categories: bottom-up or top-down. Bottom-up systems are the most popular but can sometimes suffer from instability caused by difficulties with merging and stopping criteria. Top-down systems deliver competitive results but are particularly prone to poor model initialization, which often leads to large variations in performance. This paper presents a new integrated bottom-up/top-down approach to speaker diarization which aims to harness the strengths of each system and thus to improve performance and stability. In contrast to previous work, here the two systems are fused at the heart of the segmentation and clustering stage. Experimental results show improvements in speaker diarization performance for both meeting and TV-show domain data, indicating increased intra- and inter-domain stability. On the TV-show data in particular, an average relative improvement of 32% in DER is obtained.
Database: OpenAIRE