Search results

Academic article

The Sound Demixing Challenge 2023 – Cinematic Demixing Track

Authors: Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji

Published in: Transactions of the International Society for Music Information Retrieval, Vol 7, Iss 1, Pp 44–62 (2024)

This paper summarizes the cinematic demixing (CDX) track of the Sound Demixing Challenge 2023 (SDX’23). We provide a comprehensive summary of the challenge setup, detailing the structure of the competition and the datasets used. Especially, we deta

External link: https://doaj.org/article/4d5fad42081f45e48007f343da8811e3

Academic article

Elevated levels of FMRP-target MAP1B impair human and mouse neuronal development and mouse social behaviors via autophagy pathway

Authors: Yu Guo, Minjie Shen, Qiping Dong, Natasha M. Méndez-Albelo, Sabrina X. Huang, Carissa L. Sirois, Jonathan Le, Meng Li, Ezra D. Jarzembowski, Keegan A. Schoeller, Michael E. Stockton, Vanessa L. Horner, André M. M. Sousa, Yu Gao, Birth Defects Research Laboratory, Jon E. Levine, Daifeng Wang, Qiang Chang, Xinyu Zhao

Published in: Nature Communications, Vol 14, Iss 1, Pp 1-23 (2023)

Fragile X messenger ribonucleoprotein 1 protein (FMRP) binds many mRNA targets in the brain. The contribution of these targets to fragile X syndrome (FXS) and related autism spectrum disorder (ASD) remains unclear. Here, we show that FMRP de

External link: https://doaj.org/article/c712033063bc4873a8e01bd8ebf89faa

Report

Leveraging Audio-Only Data for Text-Queried Target Sound Extraction

Authors: Saijo, Kohei; Ebbers, Janek; Germain, François G.; Khurana, Sameer; Wichern, Gordon; Le Roux, Jonathan

The goal of text-queried target sound extraction (TSE) is to extract from a mixture a sound source specified with a natural-language caption. While it is preferable to have access to large-scale text-audio pairs to address a variety of text prompts,

External link: http://arxiv.org/abs/2409.13152

Report

Enhanced Reverberation as Supervision for Unsupervised Speech Separation

Authors: Saijo, Kohei; Wichern, Gordon; Germain, François G.; Pan, Zexu; Le Roux, Jonathan

Reverberation as supervision (RAS) is a framework that allows for training monaural speech separation models from multi-channel mixtures in an unsupervised manner. In RAS, models are trained so that sources predicted from a mixture at an input channe

External link: http://arxiv.org/abs/2408.03438

Report

TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement

Authors: Saijo, Kohei; Wichern, Gordon; Germain, François G.; Pan, Zexu; Le Roux, Jonathan

Time-frequency (TF) domain dual-path models achieve high-fidelity speech separation. While some previous state-of-the-art (SoTA) models rely on RNNs, this reliance means they lack the parallelizability, scalability, and versatility of Transformer blo

External link: http://arxiv.org/abs/2408.03440

Report

Disentangled Acoustic Fields For Multimodal Physical Scene Understanding

Authors: Yin, Jie; Luo, Andrew; Du, Yilun; Cherian, Anoop; Marks, Tim K.; Le Roux, Jonathan; Gan, Chuang

We study the problem of multimodal physical scene understanding, where an embodied agent needs to find fallen objects by inferring object properties, direction, and distance of an impact sound source. Previous works adopt feed-forward neural networks

External link: http://arxiv.org/abs/2407.11333

Report

Speech dereverberation constrained on room impulse response characteristics

Authors: Bahrman, Louis; Fontaine, Mathieu; Le Roux, Jonathan; Richard, Gaël

Published in: INTERSPEECH, Sep 2024, Kos Island, Greece

Single-channel speech dereverberation aims at extracting a dry speech signal from a recording affected by the acoustic reflections in a room. However, most current deep learning-based approaches for speech dereverberation are not interpretable for ro

External link: http://arxiv.org/abs/2407.08657

Report

Sound Event Bounding Boxes

Authors: Ebbers, Janek; Germain, Francois G.; Wichern, Gordon; Le Roux, Jonathan

Sound event detection is the task of recognizing sounds and determining their extent (onset/offset times) within an audio clip. Existing systems commonly predict sound presence confidence in short time frames. Then, thresholding produces binary frame

External link: http://arxiv.org/abs/2406.04212

Report

SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers

Authors: Koo, Junghyun; Wichern, Gordon; Germain, Francois G.; Khurana, Sameer; Le Roux, Jonathan

We introduce Self-Monitored Inference-Time INtervention (SMITIN), an approach for controlling an autoregressive generative music transformer using classifier probes. These simple logistic regression probes are trained on the output of each attention

External link: http://arxiv.org/abs/2404.02252

Report

Why does music source separation benefit from cacophony?

Authors: Jeon, Chang-Bin; Wichern, Gordon; Germain, François G.; Le Roux, Jonathan

In music source separation, a standard training data augmentation procedure is to create new training samples by randomly combining instrument stems from different songs. These random mixes have mismatched characteristics compared to real music, e.g.

External link: http://arxiv.org/abs/2402.18407
