Showing 1 - 10
of 3,904
for search: '"Wei, Hsiang"'
Author:
Zhao, Mengjie, Zhong, Zhi, Mao, Zhuoyuan, Yang, Shiqi, Liao, Wei-Hsiang, Takahashi, Shusuke, Wakaki, Hiromi, Mitsufuji, Yuki
We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music. To construct OpenMU-Bench, we leveraged existing datasets and bootstrapped new annotations. …
External link:
http://arxiv.org/abs/2410.15573
Author:
Chae, Yunkee, Choi, Woosung, Takida, Yuhta, Koo, Junghyun, Ikemiya, Yukara, Zhong, Zhi, Cheuk, Kin Wai, Martínez-Ramírez, Marco A., Lee, Kyogu, Liao, Wei-Hsiang, Mitsufuji, Yuki
Recent state-of-the-art neural audio compression models have progressively adopted residual vector quantization (RVQ). Despite this success, these models employ a fixed number of codebooks per frame, which can be suboptimal in terms of rate-distortion …
External link:
http://arxiv.org/abs/2410.06016
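The residual vector quantization scheme mentioned in the abstract can be illustrated with a toy sketch: each codebook quantizes the residual left over by the previous stage, so reconstruction is the sum of the selected codewords. This is a generic NumPy illustration under assumed shapes, not the paper's model or any specific codec implementation.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Residual vector quantization of a single vector x.
    codebooks: list of (num_codes, dim) arrays, one per stage.
    Returns the chosen index per stage and the final residual."""
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:
        # Pick the codeword closest to the current residual.
        dists = np.linalg.norm(cb - residual, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        # The next stage quantizes what this stage missed.
        residual = residual - cb[idx]
    return indices, residual

def rvq_decode(indices, codebooks):
    # Reconstruction: sum of the selected codewords across stages.
    return sum(cb[i] for i, cb in zip(indices, codebooks))
```

With a fixed number of stages, every frame spends the same bit budget; the rate-distortion concern raised in the abstract is that simple frames may need fewer stages than complex ones.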
Author:
Hiranaka, Ayano, Chen, Shang-Fu, Lai, Chieh-Hsin, Kim, Dongjun, Murata, Naoki, Shibuya, Takashi, Liao, Wei-Hsiang, Sun, Shao-Hua, Mitsufuji, Yuki
Controllable generation through Stable Diffusion (SD) fine-tuning aims to improve fidelity, safety, and alignment with human guidance. Existing reinforcement learning from human feedback methods usually rely on predefined heuristic reward functions …
External link:
http://arxiv.org/abs/2410.05116
This paper presents a novel approach to deter unauthorized deepfakes and enable user tracking in generative models, even when the user has full access to the model parameters, by integrating key-based model authentication with watermarking techniques.
External link:
http://arxiv.org/abs/2409.07743
Author:
Mancusi, Michele, Halychanskyi, Yurii, Cheuk, Kin Wai, Lai, Chieh-Hsin, Uhlich, Stefan, Koo, Junghyun, Martínez-Ramírez, Marco A., Liao, Wei-Hsiang, Fabbro, Giorgio, Mitsufuji, Yuki
Music timbre transfer is a challenging task that involves modifying the timbral characteristics of an audio signal while preserving its melodic structure. In this paper, we propose a novel method based on dual diffusion bridges, trained using the …
External link:
http://arxiv.org/abs/2409.06096
DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation
Author:
Luo, Yin-Jyun, Cheuk, Kin Wai, Choi, Woosung, Uesaka, Toshimitsu, Toyama, Keisuke, Saito, Koichi, Lai, Chieh-Hsin, Takida, Yuhta, Liao, Wei-Hsiang, Dixon, Simon, Mitsufuji, Yuki
Existing work on pitch and timbre disentanglement has mostly focused on single-instrument music audio, excluding cases where multiple instruments are present. To fill the gap, we propose DisMix, a generative framework in which the pitch and …
External link:
http://arxiv.org/abs/2408.10807
Author:
Lee, Sungho, Martínez-Ramírez, Marco, Liao, Wei-Hsiang, Uhlich, Stefan, Fabbro, Giorgio, Lee, Kyogu, Mitsufuji, Yuki
We present GRAFX, an open-source library designed for handling audio processing graphs in PyTorch. Along with various library functionalities, we describe technical details on the efficient parallel computation of input graphs, signals, and processor …
External link:
http://arxiv.org/abs/2408.03204
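The core idea of an audio processing graph, as in the GRAFX abstract, can be sketched generically: processors are nodes, edges route signals, and evaluation proceeds in topological order with incoming signals summed at each node. This is a minimal plain-Python illustration of that evaluation scheme, not GRAFX's actual API or its parallel batched implementation.

```python
from collections import defaultdict, deque

def run_graph(nodes, edges, inputs):
    """Evaluate a DAG of audio processors.
    nodes: {name: callable(signal) -> signal}
    edges: list of (src, dst) pairs routing src's output into dst
    inputs: {name: external input signal} (missing nodes get 0.0)
    Returns each node's output after a topological-order sweep."""
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for s, d in edges:
        succ[s].append(d)
        indeg[d] += 1
    # Accumulator: sum of external input plus upstream outputs.
    acc = {n: inputs.get(n, 0.0) for n in nodes}
    out = {}
    ready = deque(n for n in nodes if indeg[n] == 0)
    while ready:
        n = ready.popleft()
        out[n] = nodes[n](acc[n])
        for d in succ[n]:
            acc[d] += out[n]          # mix into the downstream node
            indeg[d] -= 1
            if indeg[d] == 0:         # all upstream signals arrived
                ready.append(d)
    return out
```

A library operating on such graphs can batch nodes whose dependencies are all satisfied and run them in parallel; the serial queue above is the simplest correct ordering.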
Author:
Chen, Yu-Hua, Choi, Woosung, Liao, Wei-Hsiang, Martínez-Ramírez, Marco, Cheuk, Kin Wai, Mitsufuji, Yuki, Jang, Jyh-Shing Roger, Yang, Yi-Hsuan
Recent years have seen increasing interest in applying deep learning methods to the modeling of guitar amplifiers or effect pedals. Existing methods are mainly based on the supervised approach, requiring temporally-aligned data pairs of unprocessed …
External link:
http://arxiv.org/abs/2406.15751
Author:
Lee, Sungho, Martínez-Ramírez, Marco A., Liao, Wei-Hsiang, Uhlich, Stefan, Fabbro, Giorgio, Lee, Kyogu, Mitsufuji, Yuki
Music mixing is compositional -- experts combine multiple audio processors to achieve a cohesive mix from dry source tracks. We propose a method to reverse engineer this process from the input and output audio. First, we create a mixing console that …
External link:
http://arxiv.org/abs/2406.01049
Author:
Zhang, Yixiao, Ikemiya, Yukara, Choi, Woosung, Murata, Naoki, Martínez-Ramírez, Marco A., Lin, Liwei, Xia, Gus, Liao, Wei-Hsiang, Mitsufuji, Yuki, Dixon, Simon
Recent advances in text-to-music editing, which employ text queries to modify music (e.g., by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation. Previous approaches …
External link:
http://arxiv.org/abs/2405.18386