Showing 1 - 10 of 105 for search: '"Fontaine, Mathieu"'
Published in:
INTERSPEECH, Sep 2024, Kos Island, Greece
Single-channel speech dereverberation aims at extracting a dry speech signal from a recording affected by the acoustic reflections in a room. However, most current deep learning-based approaches for speech dereverberation are not interpretable for ro
External link:
http://arxiv.org/abs/2407.08657
Author:
Letzelter, Victor, Perera, David, Rommel, Cédric, Fontaine, Mathieu, Essid, Slim, Richard, Gael, Pérez, Patrick
Winner-takes-all training is a simple learning paradigm, which handles ambiguous tasks by predicting a set of plausible hypotheses. Recently, a connection was established between Winner-takes-all training and centroidal Voronoi tessellations, showing
External link:
http://arxiv.org/abs/2406.04706
Published in:
ICASSP, Apr 2024, Seoul (Korea), South Korea
Isolating the desired speaker's voice amidst multiple speakers in a noisy acoustic context is a challenging task. Personalized speech enhancement (PSE) endeavours to achieve this by leveraging prior knowledge of the speaker's voice. Recent research eff
External link:
http://arxiv.org/abs/2404.08022
Published in:
IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2024, Seoul (Korea), South Korea
Diffusion models are receiving a growing interest for a variety of signal generation tasks such as speech or music synthesis. WaveGrad, for example, is a successful diffusion model that conditionally uses the mel spectrogram to guide a diffusion proc
External link:
http://arxiv.org/abs/2402.15516
Published in:
IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2024, Seoul (Korea), South Korea
Generative adversarial network (GAN) models can synthesize high-quality audio signals while ensuring fast sample generation. However, they are difficult to train and are prone to several issues including mode collapse and divergence. In this paper, we
External link:
http://arxiv.org/abs/2402.01753
Published in:
IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr 2024, Seoul (Korea), South Korea
Overlapped speech is notoriously problematic for speaker diarization systems. Consequently, the use of speech separation has recently been proposed to improve their performance. Although promising, speech separation models struggle with realistic dat
External link:
http://arxiv.org/abs/2402.00067
Author:
Letzelter, Victor, Fontaine, Mathieu, Chen, Mickaël, Pérez, Patrick, Essid, Slim, Richard, Gaël
Published in:
Advances in neural information processing systems, Dec 2023, New Orleans, United States
We introduce Resilient Multiple Choice Learning (rMCL), an extension of the MCL approach for conditional distribution estimation in regression settings where multiple targets may be sampled for each training input. Multiple Choice Learning is a simpl
External link:
http://arxiv.org/abs/2311.01052
We address the problem of accurately interpolating measured anechoic steering vectors with a deep learning framework called the neural field. This task plays a pivotal role in reducing the resource-intensive measurements required for precise sound so
External link:
http://arxiv.org/abs/2305.04447
Author:
Nugraha, Aditya Arie, Sekiguchi, Kouhei, Fontaine, Mathieu, Bando, Yoshiaki, Yoshii, Kazuyoshi
This paper describes a practical dual-process speech enhancement system that adapts environment-sensitive frame-online beamforming (front-end) with help from environment-free block-online source separation (back-end). To use minimum variance distorti
External link:
http://arxiv.org/abs/2207.10934
Author:
Sekiguchi, Kouhei, Nugraha, Aditya Arie, Du, Yicheng, Bando, Yoshiaki, Fontaine, Mathieu, Yoshii, Kazuyoshi
This paper describes the practical response- and performance-aware development of online speech enhancement for an augmented reality (AR) headset that helps a user understand conversations made in real noisy echoic environments (e.g., cocktail party)
External link:
http://arxiv.org/abs/2207.07296