Speaker Diarization and Linking of Meeting Data

Authors: Marc Ferras, Srikanth Madikeri, Hervé Bourlard
Year of publication: 2016
Subject:
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing. 24:1935-1945
ISSN: 2329-9304, 2329-9290
DOI: 10.1109/taslp.2016.2590139
Description: Finding who spoke when in a collection of recordings, with speakers being uniquely identified across the database, is a challenging task. In this scenario, reasonable computing times and acoustic variation across recordings remain two major concerns to address in state-of-the-art speaker diarization systems. This paper extends prior work on diarizing large speech datasets using algorithms that scale well with increasing amounts of data while compensating for across-recording variability. We follow a two-stage approach performing speaker diarization and speaker linking, the former focusing on local within-recording speaker changes and the latter focusing on global speaker changes across the database. In this study, we explore how these two modules interact with each other, while proposing a diarization fusion approach that prevents diarization errors from propagating to the linking stage. We further explore the diarization fusion for speaker linking using different linking strategies and speaker modeling variants. Evaluation on single distant microphone data from the Augmented Multi-party Interaction (AMI) corpus shows the effectiveness of the fusion approach after speaker linking and intersession variability modeling via joint factor analysis.
Database: OpenAIRE