Search results - "Ikemiya, Yukara"

Report

VRVQ: Variable Bitrate Residual Vector Quantization for Audio Compression

Authors: Chae, Yunkee; Choi, Woosung; Takida, Yuhta; Koo, Junghyun; Ikemiya, Yukara; Zhong, Zhi; Cheuk, Kin Wai; Martínez-Ramírez, Marco A.; Lee, Kyogu; Liao, Wei-Hsiang; Mitsufuji, Yuki

Recent state-of-the-art neural audio compression models have progressively adopted residual vector quantization (RVQ). Despite this success, these models employ a fixed number of codebooks per frame, which can be suboptimal in terms of rate-distortio

External link: http://arxiv.org/abs/2410.06016

Report

SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond

Authors: Comunità, Marco; Zhong, Zhi; Takahashi, Akira; Yang, Shiqi; Zhao, Mengjie; Saito, Koichi; Ikemiya, Yukara; Shibuya, Takashi; Takahashi, Shusuke; Mitsufuji, Yuki

Recent advances in generative models that iteratively synthesize audio clips sparked great success to text-to-audio synthesis (TTA), but with the cost of slow synthesis speed and heavy computation. Although there have been attempts to accelerate the

External link: http://arxiv.org/abs/2406.17672

Report

Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

Authors: Zhang, Yixiao; Ikemiya, Yukara; Choi, Woosung; Murata, Naoki; Martínez-Ramírez, Marco A.; Lin, Liwei; Xia, Gus; Liao, Wei-Hsiang; Mitsufuji, Yuki; Dixon, Simon

Recent advances in text-to-music editing, which employ text queries to modify music (e.g., by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation. Previous approaches in

External link: http://arxiv.org/abs/2405.18386

Report

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

Authors: Zhang, Yixiao; Ikemiya, Yukara; Xia, Gus; Murata, Naoki; Martínez-Ramírez, Marco A.; Liao, Wei-Hsiang; Mitsufuji, Yuki; Dixon, Simon

Recent advances in text-to-music generation models have opened new avenues in musical creativity. However, music generation usually involves iterative refinements, and how to edit the generated music remains a significant challenge. This paper introd

External link: http://arxiv.org/abs/2402.06178

Report

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Authors: Takida, Yuhta; Ikemiya, Yukara; Shibuya, Takashi; Shimada, Kazuki; Choi, Woosung; Lai, Chieh-Hsin; Murata, Naoki; Uesaka, Toshimitsu; Uchida, Kengo; Liao, Wei-Hsiang; Mitsufuji, Yuki

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical structures for m

External link: http://arxiv.org/abs/2401.00365

Report

Automatic Piano Transcription with Hierarchical Frequency-Time Transformer

Authors: Toyama, Keisuke; Akama, Taketo; Ikemiya, Yukara; Takida, Yuhta; Liao, Wei-Hsiang; Mitsufuji, Yuki

Taking long-term spectral and temporal dependencies into account is essential for automatic piano transcription. This is especially helpful when determining the precise onset and offset for each note in the polyphonic piano content. In this case, we

External link: http://arxiv.org/abs/2307.04305

Report

Singing Voice Separation and Vocal F0 Estimation based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation

Authors: Ikemiya, Yukara; Itoyama, Katsutoshi; Yoshii, Kazuyoshi

This paper presents a new method of singing voice analysis that performs mutually-dependent singing voice separation and vocal fundamental frequency (F0) estimation. Vocal F0 estimation is considered to become easier if singing voices can be separate

External link: http://arxiv.org/abs/1604.00192