Search results - "Benetos, Emmanouil"

Report

Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model

Authors: Huang, Jiawen; Benetos, Emmanouil

Multilingual automatic lyrics transcription (ALT) is a challenging task due to the limited availability of labelled data and the challenges introduced by singing, compared to multilingual automatic speech recognition. Although some multilingual singi

External link: http://arxiv.org/abs/2406.17618

Report

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind pro

External link: http://arxiv.org/abs/2405.19327

Report

Explaining models relating objects and privacy

Authors: Xompero, Alessio; Bontonou, Myriam; Arbona, Jean-Michel; Benetos, Emmanouil; Cavallaro, Andrea

Accurately predicting whether an image is private before sharing it online is difficult due to the vast variety of content and the subjective nature of privacy itself. In this paper, we evaluate privacy models that use objects extracted from an image

External link: http://arxiv.org/abs/2405.01646

Report

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

Authors: Deng, Qixin; Yang, Qikai; Yuan, Ruibin; Huang, Yipeng; Wang, Yi; Liu, Xubo; Tian, Zeyue; Pan, Jiahao; Zhang, Ge; Lin, Hanfeng; Li, Yizhi; Ma, Yinghao; Fu, Jie; Lin, Chenghua; Benetos, Emmanouil; Wang, Wenwu; Xia, Guangyu; Xue, Wei; Guo, Yike

Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM

External link: http://arxiv.org/abs/2404.18081

Report

MuPT: A Generative Symbolic Music Pretrained Transformer

In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Nota

External link: http://arxiv.org/abs/2404.06393

Report

Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection

Authors: Liang, Jinhua; Nolasco, Ines; Ghani, Burooj; Phan, Huy; Benetos, Emmanouil; Stowell, Dan

Detecting the presence of animal vocalisations in nature is essential to study animal populations and their behaviors. A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims t

External link: http://arxiv.org/abs/2403.18638

Report

Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models

Authors: Postolache, Emilian; Mariani, Giorgio; Cosmo, Luca; Benetos, Emmanouil; Rodolà, Emanuele

Multi-Source Diffusion Models (MSDM) allow for compositional musical generation tasks: generating a set of coherent sources, creating accompaniments, and performing source separation. Despite their versatility, they require estimating the joint distr

External link: http://arxiv.org/abs/2403.11706

Report

WavCraft: Audio Editing and Generation with Large Language Models

Authors: Liang, Jinhua; Zhang, Huan; Liu, Haohe; Cao, Yin; Kong, Qiuqiang; Liu, Xubo; Wang, Wenwu; Plumbley, Mark D.; Phan, Huy; Benetos, Emmanouil

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing. Specifically, WavCraft describes the content of raw audio materials in natural lang

External link: http://arxiv.org/abs/2403.09527

Report

ChatMusician: Understanding and Generating Music Intrinsically with LLM

While Large Language Models (LLMs) demonstrate impressive capabilities in text generation, we find that their ability has yet to be generalized to music, humanity's creative language. We introduce ChatMusician, an open-source LLM that integrates intr

External link: http://arxiv.org/abs/2402.16153

Report

A Data-Driven Analysis of Robust Automatic Piano Transcription

Authors: Edwards, Drew; Dixon, Simon; Benetos, Emmanouil; Maezawa, Akira; Kusaka, Yuta

Algorithms for automatic piano transcription have improved dramatically in recent years due to new datasets and modeling techniques. Recent developments have focused primarily on adapting new neural network architectures, such as the Transformer and

External link: http://arxiv.org/abs/2402.01424
