Showing 1 - 10 of 5,143 for search: '"Johnson, Mark P."'
Intracranial aneurysms (IAs) that rupture result in significant morbidity and mortality. While traditional risk models such as the PHASES score are useful in clinical decision making, machine learning (ML) models offer the potential to provide more a
External link:
http://arxiv.org/abs/2410.00121
Published in:
Proceedings of Interspeech 2024
This paper enhances dysarthric and dysphonic speech recognition by fine-tuning pretrained automatic speech recognition (ASR) models on the 2023-10-05 data package of the Speech Accessibility Project (SAP), which contains the speech of 253 people with
External link:
http://arxiv.org/abs/2409.19818
Authors:
Wu, Junkai, Fan, Xulin, Lu, Bo-Ru, Jiang, Xilin, Mesgarani, Nima, Hasegawa-Johnson, Mark, Ostendorf, Mari
In recent years, we have observed a rapid advancement in speech language models (SpeechLLMs), catching up with humans' listening and reasoning abilities. SpeechLLMs have demonstrated impressive spoken dialog question-answering (SQA) performance in be
External link:
http://arxiv.org/abs/2409.04927
This work reimplements a recent semantic bootstrapping child-language acquisition model, which was originally designed for English, and trains it to learn a new language: Hebrew. The model learns from pairs of utterances and logical forms as meaning
External link:
http://arxiv.org/abs/2408.12254
Test-Time Adaptation (TTA) has emerged as a crucial solution to the domain shift challenge, wherein the target environment diverges from the original training environment. A prime exemplification is TTA for Automatic Speech Recognition (ASR), which e
External link:
http://arxiv.org/abs/2408.05769
Authors:
Yoon, Eunseop, Yoon, Hee Suk, Eom, SooHwan, Han, Gunsoo, Nam, Daniel Wontae, Jo, Daejin, On, Kyoung-Woon, Hasegawa-Johnson, Mark A., Kim, Sungwoong, Yoo, Chang D.
Reinforcement Learning from Human Feedback (RLHF) leverages human preference data to train language models to align more closely with human essence. These human preference data, however, are labeled at the sequence level, creating a mismatch between
External link:
http://arxiv.org/abs/2407.16574
Authors:
Khan, Mohammad Nur Hossain, Li, Jialu, McElwain, Nancy L., Hasegawa-Johnson, Mark, Islam, Bashima
Certain environmental noises have been associated with negative developmental outcomes for infants and young children. Though classifying or tagging sound events in a domestic environment is an active research area, previous studies focused on data c
External link:
http://arxiv.org/abs/2406.17190
Authors:
Ni, Junrui, Wang, Liming, Zhang, Yang, Qian, Kaizhi, Gao, Heting, Hasegawa-Johnson, Mark, Yoo, Chang D.
Recent advancements in supervised automatic speech recognition (ASR) have achieved remarkable performance, largely due to the growing availability of large transcribed speech corpora. However, most languages lack sufficient paired speech and text dat
External link:
http://arxiv.org/abs/2406.08380
Authors:
Yoon, Hee Suk, Yoon, Eunseop, Tee, Joshua Tian Jin, Hasegawa-Johnson, Mark, Li, Yingzhen, Yoo, Chang D.
In deep learning, test-time adaptation has gained attention as a method for model fine-tuning without the need for labeled data. A prime exemplification is the recently proposed test-time prompt tuning for large-scale vision-language models such as C
External link:
http://arxiv.org/abs/2403.14119
Authors:
Eom, SooHwan, Yoon, Eunseop, Yoon, Hee Suk, Kim, Chanwoo, Hasegawa-Johnson, Mark, Yoo, Chang D.
In Automatic Speech Recognition (ASR) systems, a recurring obstacle is the generation of narrowly focused output distributions. This phenomenon emerges as a side effect of Connectionist Temporal Classification (CTC), a robust sequence learning tool t
External link:
http://arxiv.org/abs/2403.11578