Showing 1 - 10 of 66 for search: '"Choi, Keunwoo"'
Author:
Tailleur, Modan; Lee, Junwon; Lagrange, Mathieu; Choi, Keunwoo; Heller, Laurie M.; Imoto, Keisuke; Okamoto, Yuki
This paper explores whether considering alternative domain-specific embeddings to calculate the Fréchet Audio Distance (FAD) metric can help the FAD to correlate better with perceptual ratings of environmental sounds. We used embeddings from VGGish
External link:
http://arxiv.org/abs/2403.17508
The equitable distribution of academic data is crucial for ensuring equal research opportunities, and ultimately further progress. Yet, due to the complexity of using the API for audio data that corresponds to the Million Song Dataset along with its
External link:
http://arxiv.org/abs/2308.16389
Automatic music captioning, which generates natural language descriptions for given music tracks, holds significant potential for enhancing the understanding and organization of large volumes of musical data. Despite its importance, researchers face
External link:
http://arxiv.org/abs/2307.16372
In this work, we address the challenge of lyrics alignment, which involves aligning the lyrics and vocal components of songs. This problem requires the alignment of two distinct modalities, namely text and audio. To overcome this challenge, we propos
External link:
http://arxiv.org/abs/2307.04377
To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and def
External link:
http://arxiv.org/abs/2307.04292
In real-world acoustic scenarios, there often are multiple sound sources present in a room. These sources are situated in various locations and produce sounds that reach the listener from multiple directions. The presence of multiple sources in a roo
External link:
http://arxiv.org/abs/2305.15898
Author:
Choi, Keunwoo; Im, Jaekwon; Heller, Laurie; McFee, Brian; Imoto, Keisuke; Okamoto, Yuki; Lagrange, Mathieu; Takamichi, Shinosuke
The addition of Foley sound effects during post-production is a common technique used to enhance the perceived acoustic properties of multimedia content. Traditionally, Foley sound has been produced by human Foley artists, which involves manual recor
External link:
http://arxiv.org/abs/2304.12521
We introduce a framework that recommends music based on the emotions of speech. In content creation and daily life, speech contains information about human emotions, which can be enhanced by music. Our framework focuses on a cross-domain retrieval sy
External link:
http://arxiv.org/abs/2303.10539
Author:
Cheuk, Kin Wai; Choi, Keunwoo; Kong, Qiuqiang; Li, Bochen; Won, Minz; Wang, Ju-Chiang; Hung, Yun-Ning; Herremans, Dorien
In this paper, we introduce Jointist, an instrument-aware multi-instrument framework that is capable of transcribing, recognizing, and separating multiple musical instruments from an audio clip. Jointist consists of an instrument recognition module t
External link:
http://arxiv.org/abs/2302.00286
This paper introduces effective design choices for text-to-music retrieval systems. An ideal text-based retrieval system would support various input queries such as pre-defined tags, unseen tags, and sentence-level descriptions. In reality, most prev
External link:
http://arxiv.org/abs/2211.14558