Search results - "Kögel, Fabian"

Report

Towards Robust FastSpeech 2 by Modelling Residual Multimodality

Authors: Kögel, Fabian; Nguyen, Bac; Cardinaux, Fabien

State-of-the-art non-autoregressive text-to-speech (TTS) models based on FastSpeech 2 can efficiently synthesise high-fidelity and natural speech. For expressive speech datasets however, we observe characteristic audio distortions. We demonstrate tha

External link: http://arxiv.org/abs/2306.01442

Report

Multimodal Integration of Human-Like Attention in Visual Question Answering

Authors: Sood, Ekta; Kögel, Fabian; Müller, Philipp; Thomas, Dominike; Bace, Mihai; Bulling, Andreas

Human-like attention as a supervisory signal to guide neural attention has shown significant promise but is currently limited to uni-modal integration - even for inherently multimodal tasks such as visual question answering (VQA). We present the Mult

External link: http://arxiv.org/abs/2109.13139

Report

VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering

Authors: Sood, Ekta; Kögel, Fabian; Strohm, Florian; Dhar, Prajit; Bulling, Andreas

We present VQA-MHUG - a novel 49-participant dataset of multimodal human gaze on both images and questions during visual question answering (VQA) collected using a high-speed eye tracker. We use our dataset to analyze the similarity between human and

External link: http://arxiv.org/abs/2109.13116
