Speaker Diarization: A Top-Down Approach Using Syllabic Phonology
Authors: | David Suendermann-Oeft, Amanda L. Robinson, Mark Miller, Michael Brenndoerfer, Erik Edwards, Nico Axtmann, Greg P. Finley, Maxim Korenevsky, Najmeh Sadoughi |
---|---|
Year of publication: | 2018 |
Subject: | Voice activity detection; Computer science; Speech recognition; 020206 networking & telecommunications; Phonology; 02 engineering and technology; Speaker diarisation; 030507 speech-language pathology & audiology; 03 medical and health sciences; 0202 electrical engineering electronic engineering information engineering; Syllabic verse; Syllable; 0305 other medical science; Set (psychology); Hidden Markov model; Decoding methods |
Source: | Speech and Computer (SPECOM), ISBN: 9783319995786 |
Description: | A top-down approach to speaker diarization is developed using a modified Baum-Welch algorithm. The HMM states combine phonemes according to structural positions under syllabic phonological theory. By nature of the structural phonology, there are at most 16 states, and the transition matrix is sparse, allowing efficient decoding to structural phones. This addresses the issue of phoneme specificity in speaker diarization – that speaker similarities/differences are confounded by phonetic similarities/differences. We address this here without the expensive use of a complete set of individual phonemes. The voice activity detection (VAD) issue is likewise addressed, giving a new approach to VAD. |
Database: | OpenAIRE |
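
Note on the description: the claim that a small state set with a sparse transition matrix allows efficient decoding can be illustrated with a short sketch. This is not the authors' modified Baum-Welch procedure. It is ordinary Viterbi decoding that visits only the allowed transitions, so the cost per frame scales with the number of nonzero transitions rather than with the square of the number of states. The three-state topology (silence, onset, nucleus), the function name `viterbi_sparse`, and all probabilities below are hypothetical and chosen only for illustration.

```python
import math


def viterbi_sparse(log_init, log_trans, log_emit):
    """Most likely state path for an HMM with sparse transitions.

    log_init:  list of S log initial probabilities.
    log_trans: dict mapping state j to a list of (i, log P(i -> j)) pairs,
               listing allowed predecessors only; missing pairs are forbidden.
    log_emit:  list of T frames, each a list of S log emission scores.
    """
    n_states = len(log_init)
    # delta[j]: best log score of any path ending in state j at this frame.
    delta = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    backptrs = []
    for frame in log_emit[1:]:
        new_delta = [-math.inf] * n_states
        ptr = [-1] * n_states
        for j in range(n_states):
            # Only the allowed predecessors of j are visited.
            for i, log_p in log_trans.get(j, ()):
                score = delta[i] + log_p
                if score > new_delta[j]:
                    new_delta[j] = score
                    ptr[j] = i
            if ptr[j] != -1:
                new_delta[j] += frame[j]
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    best_score = delta[state]
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Hypothetical toy topology: 0 = silence, 1 = onset, 2 = nucleus.
log_init = [0.0, -math.inf, -math.inf]  # always start in silence
log_trans = {
    0: [(0, math.log(0.5)), (2, math.log(0.5))],  # silence <- silence, nucleus
    1: [(0, math.log(0.5))],                      # onset   <- silence
    2: [(1, math.log(1.0)), (2, math.log(0.5))],  # nucleus <- onset, nucleus
}
# Made-up per-frame emission probabilities for four frames.
emit = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
]
log_emit = [[math.log(p) for p in frame] for frame in emit]

path, score = viterbi_sparse(log_init, log_trans, log_emit)
print(path, math.exp(score))
```

With at most 16 states, a dense update would examine up to 256 transitions per frame; storing only the allowed predecessors keeps the inner loop proportional to the number of permitted transitions.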