Showing 1 - 4 of 4 for search: '"Liao, Weihsiang"'
Author:
Liao, WeiHsiang, Takida, Yuhta, Ikemiya, Yukara, Zhong, Zhi, Lai, Chieh-Hsin, Fabbro, Giorgio, Shimada, Kazuki, Toyama, Keisuke, Cheuk, Kinwai, Martínez-Ramírez, Marco A., Takahashi, Shusuke, Uhlich, Stefan, Akama, Taketo, Choi, Woosung, Koyama, Yuichiro, Mitsufuji, Yuki
We demonstrate the efficacy of using intermediate representations from a single foundation model to enhance various music downstream tasks. We introduce SoniDo, a music foundation model (MFM) designed to extract hierarchical features from target music…
External link:
http://arxiv.org/abs/2411.01135
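A minimal sketch of the general idea in the abstract, assuming a PyTorch-style model: extract intermediate (hierarchical) representations from a frozen pretrained network via forward hooks and feed them to a lightweight downstream head. All module names, sizes, and the toy model are hypothetical, not SoniDo's actual architecture or API.

```python
import torch
import torch.nn as nn

# Stand-in for a frozen pretrained music foundation model (hypothetical).
foundation = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),   # low-level features
    nn.Linear(256, 256), nn.ReLU(),   # mid-level features
    nn.Linear(256, 64),               # high-level features
)
foundation.eval()
for p in foundation.parameters():
    p.requires_grad_(False)

# Capture per-layer activations with forward hooks.
features = {}
def save(name):
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

foundation[1].register_forward_hook(save("low"))
foundation[3].register_forward_hook(save("mid"))
foundation[4].register_forward_hook(save("high"))

x = torch.randn(8, 128)               # batch of stand-in audio frames
_ = foundation(x)

# A lightweight downstream head consumes the concatenated features.
z = torch.cat([features["low"], features["mid"], features["high"]], dim=-1)
head = nn.Linear(z.shape[-1], 10)     # e.g., a 10-class tagging task
logits = head(z)
print(logits.shape)                   # torch.Size([8, 10])
```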
In the realm of audio watermarking, it is challenging to simultaneously encode imperceptible messages while enhancing the message capacity and robustness. Although recent advancements in deep learning-based methods bolster the message capacity and robustness…
External link:
http://arxiv.org/abs/2406.03822
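A toy sketch of the trade-off the abstract describes, offered as an illustration rather than the paper's method: an encoder adds a small message-dependent perturbation to the audio, a decoder recovers the bits, and the loss balances bit recovery against distortion. All networks, sizes, and weights here are assumptions.

```python
import torch
import torch.nn as nn

N_BITS, N_SAMPLES = 16, 1024

encoder = nn.Sequential(nn.Linear(N_BITS, 256), nn.Tanh(),
                        nn.Linear(256, N_SAMPLES))
decoder = nn.Sequential(nn.Linear(N_SAMPLES, 256), nn.ReLU(),
                        nn.Linear(256, N_BITS))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for step in range(200):
    audio = torch.randn(32, N_SAMPLES)              # stand-in host signal
    bits = torch.randint(0, 2, (32, N_BITS)).float()
    watermark = 0.01 * encoder(bits)                # keep perturbation small
    logits = decoder(audio + watermark)
    # Trade-off: bit-recovery loss vs. (here, simple L2) distortion penalty.
    loss = nn.functional.binary_cross_entropy_with_logits(logits, bits) \
           + 10.0 * watermark.pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```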
Author:
Fabbro, Giorgio, Uhlich, Stefan, Lai, Chieh-Hsin, Choi, Woosung, Martínez-Ramírez, Marco, Liao, Weihsiang, Gadelha, Igor, Ramos, Geraldo, Hsu, Eddie, Rodrigues, Hugo, Stöter, Fabian-Robert, Défossez, Alexandre, Luo, Yi, Yu, Jianwei, Chakraborty, Dipam, Mohanty, Sharada, Solovyev, Roman, Stempkovskiy, Alexander, Habruseva, Tatiana, Goswami, Nabarun, Harada, Tatsuya, Kim, Minseok, Lee, Jun Hyung, Dong, Yuanliang, Zhang, Xinran, Liu, Jiafeng, Mitsufuji, Yuki
Published in:
Transactions of the International Society for Music Information Retrieval, 7(1), pp.63-84, 2024
This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge (SDX'23). We provide a summary of the challenge setup and introduce the task of robust music source separation (MSS), i.e., training MSS models in the presence of errors…
External link:
http://arxiv.org/abs/2308.06979
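One generic way to train under corrupted training data, sketched here purely as an illustration and not necessarily the challenge paper's exact recipe, is loss truncation: drop the highest-loss items in each batch on the assumption that they are the corrupted ones. The function name and shapes are hypothetical.

```python
import torch

def truncated_mse(estimate, target, keep_ratio=0.8):
    """Per-item MSE, keeping only the `keep_ratio` lowest-loss items."""
    per_item = (estimate - target).pow(2).flatten(1).mean(dim=1)
    k = max(1, int(keep_ratio * per_item.numel()))
    kept, _ = torch.topk(per_item, k, largest=False)
    return kept.mean()

est = torch.randn(16, 2, 44100, requires_grad=True)  # (batch, ch, samples)
ref = torch.randn(16, 2, 44100)                      # possibly noisy targets
loss = truncated_mse(est, ref)
loss.backward()
```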
Author:
Takida, Yuhta, Shibuya, Takashi, Liao, WeiHsiang, Lai, Chieh-Hsin, Ohmura, Junki, Uesaka, Toshimitsu, Murata, Naoki, Takahashi, Shusuke, Kumakura, Toshiyuki, Mitsufuji, Yuki
One noted issue of vector-quantized variational autoencoder (VQ-VAE) is that the learned discrete representation uses only a fraction of the full capacity of the codebook, also known as codebook collapse. We hypothesize that the training scheme of VQ-VAE…
External link:
http://arxiv.org/abs/2205.07547
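A minimal sketch of the codebook-collapse symptom the abstract describes: quantize encoder outputs against a codebook by nearest neighbour and measure how many codes are actually used (usage and perplexity). Codebook size, dimensions, and data are illustrative assumptions.

```python
import torch

K, D = 512, 64                         # codebook size, code dimension
codebook = torch.randn(K, D)
z = torch.randn(4096, D)               # stand-in encoder outputs

# Nearest-neighbour quantization (the VQ step).
dists = torch.cdist(z, codebook)       # (4096, K) pairwise distances
codes = dists.argmin(dim=1)

# Usage statistics: a collapsed codebook concentrates mass on few codes.
hist = torch.bincount(codes, minlength=K).float()
probs = hist / hist.sum()
used = (hist > 0).sum().item()
perplexity = torch.exp(-(probs * (probs + 1e-10).log()).sum()).item()
print(f"codes used: {used}/{K}, perplexity: {perplexity:.1f}")
```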