Showing 1 - 1 of 1 for search: '"Manivannan, Mithun"'
Recent progress in audio-language modeling, such as automated audio captioning, has benefited from training on synthetic data generated with the aid of large-language models. However, such approaches for environmental sound captioning have primarily
External link:
http://arxiv.org/abs/2410.12028