Search results - "Manivannan, Mithun"

Report

EmotionCaps: Enhancing Audio Captioning Through Emotion-Augmented Data Generation

Authors: Manivannan, Mithun; Nethrapalli, Vignesh; Cartwright, Mark

Recent progress in audio-language modeling, such as automated audio captioning, has benefited from training on synthetic data generated with the aid of large language models. However, such approaches for environmental sound captioning have primarily

External link: http://arxiv.org/abs/2410.12028
