Search results

Report

PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation

Authors: Ranzinger, Mike; Barker, Jon; Heinrich, Greg; Molchanov, Pavlo; Catanzaro, Bryan; Tao, Andrew

Various visual foundation models have distinct strengths and weaknesses, both of which can be improved through heterogeneous multi-teacher knowledge distillation without labels, termed "agglomerative models." We build upon this body of work by studyi

External link: http://arxiv.org/abs/2410.01680

Report

NVLM: Open Frontier-Class Multimodal LLMs

Authors: Dai, Wenliang; Lee, Nayeon; Wang, Boxin; Yang, Zhuoling; Liu, Zihan; Barker, Jon; Rintamaki, Tuomas; Shoeybi, Mohammad; Catanzaro, Bryan; Ping, Wei

We introduce NVLM 1.0, a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 4

External link: http://arxiv.org/abs/2409.11402

Report

The first Cadenza challenges: using machine learning competitions to improve music for listeners with a hearing loss

Authors: Dabike, Gerardo Roa; Akeroyd, Michael A.; Bannister, Scott; Barker, Jon P.; Cox, Trevor J.; Fazenda, Bruno; Firth, Jennifer; Graetzer, Simone; Greasley, Alinka; Vos, Rebecca R.; Whitmer, William M.

It is well established that listening to music is an issue for those with hearing loss, and hearing aids are not a universal solution. How can machine learning be used to address this? This paper details the first application of the open challenge me

External link: http://arxiv.org/abs/2409.05095

Report

Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement

Authors: Sutherland, Robert; Close, George; Hain, Thomas; Goetze, Stefan; Barker, Jon

Machine learning techniques are an active area of research for speech enhancement for hearing aids, with one particular focus on improving the intelligibility of a noisy speech signal. Recent work has shown that feature encodings from self-supervised

External link: http://arxiv.org/abs/2407.13333

Report

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge

Authors: Leglaive, Simon; Fraticelli, Matthieu; ElGhazaly, Hend; Borne, Léonie; Sadeghi, Mostafa; Wisdom, Scott; Pariente, Manuel; Hershey, John R.; Pressnitzer, Daniel; Barker, Jon P.

Supervised models for speech enhancement are trained using artificially generated mixtures of clean speech and noise signals. However, the synthetic training conditions may not accurately reflect real-world conditions encountered during testing. This

External link: http://arxiv.org/abs/2402.01413

Report

Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models

Authors: Mogridge, Rhiannon; Close, George; Sutherland, Robert; Hain, Thomas; Barker, Jon; Goetze, Stefan; Ragni, Anton

Neural networks have been successfully used for non-intrusive speech intelligibility prediction. Recently, the use of feature representations sourced from intermediate layers of pre-trained self-supervised and weakly-supervised models has been found

External link: http://arxiv.org/abs/2401.13611

Report

Overview of the 2023 ICASSP SP Clarity Challenge: Speech Enhancement for Hearing Aids

Authors: Cox, Trevor J.; Barker, Jon; Bailey, Will; Graetzer, Simone; Akeroyd, Michael A.; Culling, John F.; Naylor, Graham

This paper reports on the design and outcomes of the ICASSP SP Clarity Challenge: Speech Enhancement for Hearing Aids. The scenario was a listener attending to a target speaker in a noisy, domestic environment. There were multiple interferers and hea

External link: http://arxiv.org/abs/2311.14490

Report

Intelligibility prediction with a pretrained noise-robust automatic speech recognition model

Authors: Tu, Zehai; Ma, Ning; Barker, Jon

This paper describes two intelligibility prediction systems derived from a pretrained noise-robust automatic speech recognition (ASR) model for the second Clarity Prediction Challenge (CPC2). One system is intrusive and leverages the hidden represent

External link: http://arxiv.org/abs/2310.19817

Report

The First Cadenza Signal Processing Challenge: Improving Music for Those With a Hearing Loss

Authors: Dabike, Gerardo Roa; Bannister, Scott; Firth, Jennifer; Graetzer, Simone; Vos, Rebecca; Akeroyd, Michael A.; Barker, Jon; Cox, Trevor J.; Fazenda, Bruno; Greasley, Alinka; Whitmer, William

The Cadenza project aims to improve the audio quality of music for those who have a hearing loss. This is being done through a series of signal processing challenges, to foster better and more inclusive technologies. In the first round, two common li

External link: http://arxiv.org/abs/2310.05799

Report

The ICASSP SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids

Authors: Dabike, Gerardo Roa; Akeroyd, Michael A.; Bannister, Scott; Barker, Jon; Cox, Trevor J.; Fazenda, Bruno; Firth, Jennifer; Graetzer, Simone; Greasley, Alinka; Vos, Rebecca R.; Whitmer, William M.

This paper reports on the design and results of the 2024 ICASSP SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids. The Cadenza project is working to enhance the audio quality of music for those with a hearing loss. The scenario for the c

External link: http://arxiv.org/abs/2310.03480
