Speaker Diarization: A Top-Down Approach Using Syllabic Phonology

Authors: David Suendermann-Oeft, Amanda L. Robinson, Mark Miller, Michael Brenndoerfer, Erik Edwards, Nico Axtmann, Greg P. Finley, Maxim Korenevsky, Najmeh Sadoughi
Year of publication: 2018
Subject:
Source: Speech and Computer, ISBN: 9783319995786
SPECOM
Description: A top-down approach to speaker diarization is developed using a modified Baum-Welch algorithm. The HMM states combine phonemes according to their structural positions under syllabic phonological theory. Because of the structural phonology, there are at most 16 states and the transition matrix is sparse, which allows efficient decoding to structural phones. This addresses the issue of phoneme specificity in speaker diarization: speaker similarities and differences are confounded by phonetic similarities and differences. The approach addresses this without the expense of modeling a complete set of individual phonemes. The voice activity detection (VAD) issue is likewise addressed, giving a new approach to VAD.
Database: OpenAIRE
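
The description notes that a small state set with a sparse transition matrix allows efficient decoding. The following is a minimal sketch of that general idea only, using standard log-domain Viterbi decoding that visits just the allowed transitions. It is not the authors' modified Baum-Welch algorithm, and this record does not list the actual state inventory: the three states (silence, onset, nucleus), the function name viterbi_sparse, and all probabilities below are hypothetical values chosen for illustration.

```python
import math

NEG_INF = float("-inf")


def viterbi_sparse(log_init, predecessors, log_emit):
    """Most likely state path for an HMM whose transitions are stored sparsely.

    log_init:     list of initial log probabilities, one per state
    predecessors: dict mapping state j to a list of (i, log_p) pairs,
                  one pair per allowed transition i -> j
    log_emit:     list of frames, each a list of per-state log likelihoods
    """
    n_states = len(log_init)
    # delta[j]: best log score of any path ending in state j at the current frame
    delta = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    backptrs = []
    for frame in log_emit[1:]:
        new_delta = [NEG_INF] * n_states
        ptr = [-1] * n_states
        for j in range(n_states):
            # Only the allowed predecessors of j are visited, not all states.
            for i, log_p in predecessors.get(j, ()):
                score = delta[i] + log_p
                if score > new_delta[j]:
                    new_delta[j] = score
                    ptr[j] = i
            if ptr[j] >= 0:
                new_delta[j] += frame[j]
        delta = new_delta
        backptrs.append(ptr)
    best_score = max(delta)
    if best_score == NEG_INF:
        raise ValueError("no state path has non-zero probability")
    # Trace the best path backwards from the best final state.
    state = delta.index(best_score)
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Hypothetical toy model: 0 = silence, 1 = onset, 2 = nucleus.
toy_init = [0.0, NEG_INF, NEG_INF]  # always start in silence
toy_predecessors = {
    0: [(0, math.log(0.5)), (2, math.log(0.3))],
    1: [(0, math.log(0.5))],
    2: [(1, math.log(1.0)), (2, math.log(0.7))],
}
toy_emit = [
    [math.log(p) for p in frame]
    for frame in ([0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.8, 0.1, 0.1])
]
```

Calling viterbi_sparse(toy_init, toy_predecessors, toy_emit) returns the decoded state path and its log score. The cost per frame grows with the number of allowed transitions rather than with the square of the number of states, which is the benefit a sparse transition matrix provides.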