Automatic Quality Assessment for Audio-Visual Verification Systems. The LOVe submission to NIST SRE Challenge 2019
Author: | Olivier Le Blouch, Gaël Le Lan, Grigory Antipov, Nicolas Gengembre |
---|---|
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Focus (computing); Sound (cs.SD); Modalities; Quality assessment; Computer science; Speech recognition; Universal model; Computer Science - Sound; Multimedia (cs.MM); Audio and Speech Processing (eess.AS); Face (geometry); Audio visual; FOS: Electrical engineering, electronic engineering, information engineering; NIST; Quality (business); Computer Science - Multimedia; Electrical Engineering and Systems Science - Audio and Speech Processing |
Source: | INTERSPEECH |
DOI: | 10.48550/arxiv.2008.05889 |
Description: | Fusion of scores is a cornerstone of multimodal biometric systems composed of independent unimodal parts. In this work, we focus on quality-dependent fusion for speaker-face verification. To this end, we propose a universal model which can be trained for automatic quality assessment of both face and speaker modalities. This model estimates the quality of the representations produced by the unimodal systems; these quality estimates are then used to enhance the score-level fusion of the speaker and face verification modules. We demonstrate the improvements brought by this quality-dependent fusion on the recent NIST SRE19 Audio-Visual Challenge dataset. Comment: 5 pages, 1 figure, accepted at INTERSPEECH 2020. Corrected the reference [20] |
Database: | OpenAIRE |