Search results

Report

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

Authors: Tian, Zeyue; Liu, Zhaoyang; Yuan, Ruibin; Pan, Jiahao; Huang, Xiaoqiang; Liu, Qifeng; Tan, Xu; Chen, Qifeng; Xue, Wei; Guo, Yike

In this work, we systematically study music generation conditioned solely on the video. First, we present a large-scale dataset comprising 190K video-music pairs, including various genres such as movie trailers, advertisements, and documentaries. Fur

External link: http://arxiv.org/abs/2406.04321

Report

LLMs Meet Multimodal Generation and Editing: A Survey

Authors: He, Yingqing; Liu, Zhaoyang; Chen, Jingye; Tian, Zeyue; Liu, Hongyu; Chi, Xiaowei; Liu, Runtao; Yuan, Ruibin; Xing, Yazhou; Wang, Wenhai; Dai, Jifeng; Zhang, Yong; Xue, Wei; Liu, Qifeng; Guo, Yike; Chen, Qifeng

With the recent advancement in large language models (LLMs), there is a growing interest in combining LLMs with multimodal learning. Previous surveys of multimodal large language models (MLLMs) mainly focus on multimodal understanding. This survey el

External link: http://arxiv.org/abs/2405.19334

Report

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

Authors: Deng, Qixin; Yang, Qikai; Yuan, Ruibin; Huang, Yipeng; Wang, Yi; Liu, Xubo; Tian, Zeyue; Pan, Jiahao; Zhang, Ge; Lin, Hanfeng; Li, Yizhi; Ma, Yinghao; Fu, Jie; Lin, Chenghua; Benetos, Emmanouil; Wang, Wenwu; Xia, Guangyu; Xue, Wei; Guo, Yike

Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM

External link: http://arxiv.org/abs/2404.18081

Report

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

Authors: Xing, Yazhou; He, Yingqing; Tian, Zeyue; Wang, Xintao; Chen, Qifeng

Video and audio content creation serves as the core technique for the movie industry and professional users. Recently, existing diffusion-based methods tackle video and audio generation separately, which hinders the technique transfer from academia t

External link: http://arxiv.org/abs/2402.17723

Report

ChatMusician: Understanding and Generating Music Intrinsically with LLM

While Large Language Models (LLMs) demonstrate impressive capabilities in text generation, we find that their ability has yet to be generalized to music, humanity's creative language. We introduce ChatMusician, an open-source LLM that integrates intr

External link: http://arxiv.org/abs/2402.16153

Report

MARBLE: Music Audio Representation Benchmark for Universal Evaluation

In the era of extensive intersection between art and Artificial Intelligence (AI), such as image generation and fiction co-creation, AI for music remains relatively nascent, particularly in music understanding. This is evident in the limited work on

External link: http://arxiv.org/abs/2306.10548

Report

Mixed Neural Voxels for Fast Multi-view Video Synthesis

Authors: Wang, Feng; Tan, Sinan; Li, Xinghang; Tian, Zeyue; Song, Yafei; Liu, Huaping

Synthesizing high-fidelity videos from real-world multi-view input is challenging because of the complexities of real-world environments and highly dynamic motions. Previous works based on neural radiance fields have demonstrated high-quality reconst

External link: http://arxiv.org/abs/2212.00190

Periodical

Deep Cascade Gradient RBF Networks With Output-Relevant Feature Extraction and Adaptation for Nonlinear and Nonstationary Processes

Authors: Liu, Tong; Tian, Zeyue; Chen, Sheng; Wang, Kai; Harris, Chris J.

Published in: IEEE Transactions on Cybernetics, August 2023, Vol. 53, Issue 8, pp. 4908-4922 (15 pages)
