Search results - "Elhilali, Mounya"

Report

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Authors: Hai, Jiarui, Xu, Yong, Zhang, Hao, Li, Chenxing, Wang, Helin, Elhilali, Mounya, Yu, Dong

Latent diffusion models have shown promising results in text-to-audio (T2A) generation tasks, yet previous models have encountered difficulties in generation quality, computational cost, diffusion sampling, and data preparation. In this paper, we int

External link: http://arxiv.org/abs/2409.10819

Report

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer

Authors: Wang, Helin, Hai, Jiarui, Lu, Yen-Ju, Thakkar, Karan, Elhilali, Mounya, Dehak, Najim

In this paper, we introduce SoloAudio, a novel diffusion-based generative model for target sound extraction (TSE). Our approach trains latent diffusion models on audio, replacing the previous U-Net backbone with a skip-connected Transformer that oper

External link: http://arxiv.org/abs/2409.08425

Report

DreamVoice: Text-Guided Voice Conversion

Authors: Hai, Jiarui, Thakkar, Karan, Wang, Helin, Qin, Zengyi, Elhilali, Mounya

Generative voice technologies are rapidly evolving, offering opportunities for more personalized and inclusive experiences. Traditional one-shot voice conversion (VC) requires a target recording during inference, limiting ease of usage in generating

External link: http://arxiv.org/abs/2406.16314

Report

Investigating Self-Supervised Deep Representations for EEG-based Auditory Attention Decoding

Authors: Thakkar, Karan, Hai, Jiarui, Elhilali, Mounya

Auditory Attention Decoding (AAD) algorithms play a crucial role in isolating desired sound sources within challenging acoustic environments directly from brain activity. Although recent research has shown promise in AAD using shallow representations

External link: http://arxiv.org/abs/2311.00814

Report

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

Authors: Hai, Jiarui, Wang, Helin, Yang, Dongchao, Thakkar, Karan, Dehak, Najim, Elhilali, Mounya

Common target sound extraction (TSE) approaches primarily relied on discriminative approaches in order to separate the target sound while minimizing interference from the unwanted sources, with varying success in separating the target from the backgr

External link: http://arxiv.org/abs/2310.04567

Report

Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures

Authors: Park, Sangwook, Han, David K., Elhilali, Mounya

Sound event detection is an important facet of audio tagging that aims to identify sounds of interest and define both the sound category and time boundaries for each sound event in a continuous recording. With advances in deep neural networks, there

External link: http://arxiv.org/abs/2105.13392

Academic article

Reference free auscultation quality metric and its trends

Authors: Kala, Annapurna, McCollum, Eric D., Elhilali, Mounya

Published in: Biomedical Signal Processing and Control, August 2023, 85

Report

Joint Acoustic and Class Inference for Weakly Supervised Sound Event Detection

Authors: Kothinti, Sandeep, Imoto, Keisuke, Chakrabarty, Debmalya, Sell, Gregory, Watanabe, Shinji, Elhilali, Mounya

Sound event detection is a challenging task, especially for scenes with multiple simultaneous events. While event classification methods tend to be fairly accurate, event localization presents additional challenges, especially when large amounts of l

External link: http://arxiv.org/abs/1811.04048
