Speaker Diarization For Multiple-Distant-Microphone Meetings Using Several Sources of Information

Authors:	José Manuel Pardo, Chuck Wooters, Xavier Anguera
Year of publication:	2007
Subject:	Speaker diarisation; Sentence boundary disambiguation; Computational Theory and Mathematics; Hardware and Architecture; Computer science; Microphone; Speech recognition; Source separation; NIST; Transcription (software); Hidden Markov model; Software; Theoretical Computer Science
Source:	IEEE Transactions on Computers. 56:1212-1224
ISSN:	0018-9340
DOI:	10.1109/tc.2007.1077
Description:	Human-machine interaction in meetings requires the localization and identification of the speakers interacting with the system as well as the recognition of the words spoken. A seminal step toward this goal is the field of rich transcription research, which includes speaker diarization together with the annotation of sentence boundaries and the elimination of speaker disfluencies. The sub-area of speaker diarization attempts to identify the number of participants in a meeting and create a list of speech time intervals for each such participant. In this paper, we analyze the correlation between signals coming from multiple microphones and propose an improved method for carrying out speaker diarization for meetings with multiple distant microphones. The proposed algorithm makes use of acoustic information and information from the delays between signals coming from the different sources. Using this procedure, we were able to achieve state-of-the-art performance in the NIST spring 2006 rich transcription evaluation, improving the Diarization Error Rate (DER) by 15% to 20% relative to previous systems.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::d2e148c94f128cd8a03fbcd054fedfba https://doi.org/10.1109/tc.2007.1077
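
The description mentions using the delays between signals arriving at different microphones. As a rough illustration only, and not the method used in the paper, the sketch below estimates the delay between two channels by picking the lag that maximizes their cross-correlation. The function name `estimate_delay`, the `max_lag` parameter, and the toy signals are all hypothetical and chosen for this example.

```python
def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) at which `sig` best matches `ref`.

    A positive result means `sig` looks like a delayed copy of `ref`.
    This is a plain time-domain cross-correlation search, an assumption
    made for illustration; the record does not say how delays are computed.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                score += r * sig[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy example: the second channel is the first one delayed by 3 samples.
ref = [0.0, 1.0, -2.0, 3.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0] * 3 + ref[:-3]
print(estimate_delay(ref, delayed, max_lag=5))  # prints 3
```

In a multi-microphone setting, one such delay per microphone pair could serve as an extra feature alongside the acoustic features, which is the general idea the description points to.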