Showing 1 - 10 of 1,996 for search: '"Essid A"'
Author:
Perera, David, Letzelter, Victor, Mariotte, Théo, Cortés, Adrien, Chen, Mickael, Essid, Slim, Richard, Gaël
We introduce Annealed Multiple Choice Learning (aMCL), which combines simulated annealing with MCL. MCL is a learning framework that handles ambiguous tasks by predicting a small set of plausible hypotheses. These hypotheses are trained using the Winner-takes-all…
External link:
http://arxiv.org/abs/2407.15580
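The winner-takes-all rule at the heart of MCL can be sketched in a few lines. This is an illustrative toy (scalar hypotheses, a three-mode ambiguous target, made-up hyperparameters), not the paper's method or its annealed variant:

```python
import numpy as np

# Toy sketch of Winner-takes-all (WTA) training for Multiple Choice
# Learning: K hypotheses, but only the closest one ("winner") is updated
# for each target. All names and values here are illustrative.

rng = np.random.default_rng(0)
K, lr, steps = 3, 0.1, 2000

# Ambiguous task: each input admits one of three plausible targets.
modes = np.array([-2.0, 0.0, 2.0])
hypotheses = rng.normal(size=K)  # K trainable scalar hypothesis heads

for _ in range(steps):
    y = rng.choice(modes)               # sample an ambiguous target
    errors = (hypotheses - y) ** 2      # per-hypothesis squared loss
    winner = np.argmin(errors)          # winner takes all
    # gradient step on the winner only; the other hypotheses are untouched
    hypotheses[winner] -= lr * 2 * (hypotheses[winner] - y)

# Each hypothesis specializes on one mode of the target distribution.
print(np.sort(hypotheses))
```

Because only the winner moves, each head specializes on one mode instead of all heads regressing to the mean.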
Despite being trained on massive and diverse datasets, speech self-supervised encoders are generally used for downstream purposes as mere frozen feature extractors or model initializers before fine-tuning. The former severely limits the exploitation…
External link:
http://arxiv.org/abs/2407.00756
Author:
Letzelter, Victor, Perera, David, Rommel, Cédric, Fontaine, Mathieu, Essid, Slim, Richard, Gaël, Pérez, Patrick
Winner-takes-all training is a simple learning paradigm that handles ambiguous tasks by predicting a set of plausible hypotheses. Recently, a connection was established between Winner-takes-all training and centroidal Voronoi tessellations, showing…
External link:
http://arxiv.org/abs/2406.04706
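The centroidal Voronoi view mentioned in this abstract can be illustrated with Lloyd's algorithm: assign each target to its nearest hypothesis (the "winner"), then move each hypothesis to the centroid of its cell. A minimal sketch, with made-up data and hypothesis counts, not the paper's construction:

```python
import numpy as np

# Lloyd's algorithm: the batch analogue of winner-takes-all assignment.
# Points play the role of targets; "hyps" are the hypotheses/generators.

rng = np.random.default_rng(1)
points = rng.uniform(-1, 1, size=(1000, 2))  # targets to be quantized
hyps = rng.uniform(-1, 1, size=(4, 2))       # 4 hypotheses (illustrative)

for _ in range(50):
    # assignment step: each point's winner is its nearest hypothesis
    d = np.linalg.norm(points[:, None, :] - hyps[None, :, :], axis=-1)
    win = d.argmin(axis=1)
    # update step: move each hypothesis to the centroid of its Voronoi cell
    for k in range(len(hyps)):
        if np.any(win == k):
            hyps[k] = points[win == k].mean(axis=0)

# mean quantization error of the resulting tessellation
d = np.linalg.norm(points[:, None, :] - hyps[None, :, :], axis=-1)
err = d.min(axis=1).mean()
print(err)
```

At convergence each hypothesis sits at the centroid of its own cell, i.e. the configuration is a centroidal Voronoi tessellation of the target distribution.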
Published in:
ICASSP, Apr 2024, Seoul, South Korea
Isolating the desired speaker's voice amidst multiple speakers in a noisy acoustic context is a challenging task. Personalized speech enhancement (PSE) endeavours to achieve this by leveraging prior knowledge of the speaker's voice. Recent research eff…
External link:
http://arxiv.org/abs/2404.08022
Published in:
IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr 2024, Seoul, South Korea
Overlapped speech is notoriously problematic for speaker diarization systems. Consequently, the use of speech separation has recently been proposed to improve their performance. Although promising, speech separation models struggle with realistic data…
External link:
http://arxiv.org/abs/2402.00067
Current state-of-the-art audio analysis systems rely on pre-trained embedding models, often used off-the-shelf as (frozen) feature extractors. Choosing the best one for a set of tasks is the subject of many recent publications. However, one aspect of…
External link:
http://arxiv.org/abs/2312.14005
Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain with the aim of generalizing to unseen domains during inference. Existing DGSS methods typically effectuate robust features by means of Domain Randomization…
External link:
http://arxiv.org/abs/2312.09788
Author:
Letzelter, Victor, Fontaine, Mathieu, Chen, Mickaël, Pérez, Patrick, Essid, Slim, Richard, Gaël
Published in:
Advances in Neural Information Processing Systems, Dec 2023, New Orleans, United States
We introduce Resilient Multiple Choice Learning (rMCL), an extension of the MCL approach for conditional distribution estimation in regression settings where multiple targets may be sampled for each training input. Multiple Choice Learning is a simple…
External link:
http://arxiv.org/abs/2311.01052
Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach impressive performance with reduced amounts of annotated data. The high number of proposed approaches fostered the emergence of comprehensive benchmarks that evaluate…
External link:
http://arxiv.org/abs/2308.14456
Speech enhancement in ad-hoc microphone arrays is often hindered by the asynchronization of the devices composing the microphone array. Asynchronization comes from sampling time offset and sampling rate offset, which inevitably occur when the microphones…
External link:
http://arxiv.org/abs/2307.16582
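The sampling rate offset mentioned in this abstract is small in relative terms but accumulates quickly. A back-of-the-envelope sketch with hypothetical numbers (the 50 ppm clock mismatch and sample rate are illustrative, not from the paper):

```python
# Illustrative sketch: how a tiny sampling rate offset (SRO) between two
# ad-hoc recording devices accumulates into sample-level misalignment.

fs_ref = 16000.0   # reference device sample rate (Hz), assumed value
sro_ppm = 50.0     # hypothetical 50 ppm clock mismatch between devices
fs_dev = fs_ref * (1 + sro_ppm * 1e-6)

duration = 60.0    # seconds of recording
# Drift in samples between the two recordings after `duration` seconds:
drift = duration * (fs_dev - fs_ref)
print(f"drift after {duration:.0f} s: {drift:.1f} samples")
```

Even a 50 ppm mismatch yields tens of samples of drift per minute, enough to break the phase alignment that multichannel enhancement methods rely on, which is why SRO must be estimated and compensated.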