Search results - "Choi, Keunwoo"

Report

Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant

Authors: Tailleur, Modan; Lee, Junwon; Lagrange, Mathieu; Choi, Keunwoo; Heller, Laurie M.; Imoto, Keisuke; Okamoto, Yuki

This paper explores whether considering alternative domain-specific embeddings to calculate the Fréchet Audio Distance (FAD) metric can help the FAD to correlate better with perceptual ratings of environmental sounds. We used embeddings from VGGish

External link: http://arxiv.org/abs/2403.17508

Report

The Biased Journey of MSD_AUDIO.ZIP

Authors: Kim, Haven; Choi, Keunwoo; Modrzejewski, Mateusz; Liem, Cynthia C. S.

The equitable distribution of academic data is crucial for ensuring equal research opportunities, and ultimately further progress. Yet, due to the complexity of using the API for audio data that corresponds to the Million Song Dataset along with its

External link: http://arxiv.org/abs/2308.16389

Report

LP-MusicCaps: LLM-Based Pseudo Music Captioning

Authors: Doh, SeungHeon; Choi, Keunwoo; Lee, Jongpil; Nam, Juhan

Automatic music captioning, which generates natural language descriptions for given music tracks, holds significant potential for enhancing the understanding and organization of large volumes of musical data. Despite its importance, researchers face

External link: http://arxiv.org/abs/2307.16372

Report

HCLAS-X: Hierarchical and Cascaded Lyrics Alignment System Using Multimodal Cross-Correlation

Authors: Kang, Minsung; Park, Soochul; Choi, Keunwoo

In this work, we address the challenge of lyrics alignment, which involves aligning the lyrics and vocal components of songs. This problem requires the alignment of two distinct modalities, namely text and audio. To overcome this challenge, we propos

External link: http://arxiv.org/abs/2307.04377

Report

A Demand-Driven Perspective on Generative Audio AI

Authors: Oh, Sangshin; Kang, Minsung; Moon, Hyeongi; Choi, Keunwoo; Chon, Ben Sangbae

To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and def

External link: http://arxiv.org/abs/2307.04292

Report

Room Impulse Response Estimation in a Multiple Source Environment

Authors: Lee, Kyungyun; Seo, Jeonghun; Choi, Keunwoo; Lee, Sangmoon; Chon, Ben Sangbae

In real-world acoustic scenarios, there often are multiple sound sources present in a room. These sources are situated in various locations and produce sounds that reach the listener from multiple directions. The presence of multiple sources in a roo

External link: http://arxiv.org/abs/2305.15898

Report

Foley Sound Synthesis at the DCASE 2023 Challenge

Authors: Choi, Keunwoo; Im, Jaekwon; Heller, Laurie; McFee, Brian; Imoto, Keisuke; Okamoto, Yuki; Lagrange, Mathieu; Takamichi, Shinosuke

The addition of Foley sound effects during post-production is a common technique used to enhance the perceived acoustic properties of multimedia content. Traditionally, Foley sound has been produced by human Foley artists, which involves manual recor

External link: http://arxiv.org/abs/2304.12521

Report

Textless Speech-to-Music Retrieval Using Emotion Similarity

Authors: Doh, SeungHeon; Won, Minz; Choi, Keunwoo; Nam, Juhan

We introduce a framework that recommends music based on the emotions of speech. In content creation and daily life, speech contains information about human emotions, which can be enhanced by music. Our framework focuses on a cross-domain retrieval sy

External link: http://arxiv.org/abs/2303.10539

Report

Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training

Authors: Cheuk, Kin Wai; Choi, Keunwoo; Kong, Qiuqiang; Li, Bochen; Won, Minz; Wang, Ju-Chiang; Hung, Yun-Ning; Herremans, Dorien

In this paper, we introduce Jointist, an instrument-aware multi-instrument framework that is capable of transcribing, recognizing, and separating multiple musical instruments from an audio clip. Jointist consists of an instrument recognition module t

External link: http://arxiv.org/abs/2302.00286

Report

Toward Universal Text-to-Music Retrieval

Authors: Doh, SeungHeon; Won, Minz; Choi, Keunwoo; Nam, Juhan

This paper introduces effective design choices for text-to-music retrieval systems. An ideal text-based retrieval system would support various input queries such as pre-defined tags, unseen tags, and sentence-level descriptions. In reality, most prev

External link: http://arxiv.org/abs/2211.14558
