Search results - "Johnson, Mark P."

Report

Using fractal dimension to predict the risk of intra cranial aneurysm rupture with machine learning

Authors: Elavarthi, Pradyumna, Ralescu, Anca, Johnson, Mark D., Prestigiacomo, Charles J.

Intracranial aneurysms (IAs) that rupture result in significant morbidity and mortality. While traditional risk models such as the PHASES score are useful in clinical decision making, machine learning (ML) models offer the potential to provide more a

External link: http://arxiv.org/abs/2410.00121

Report

Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility

Authors: Zheng, Xiuwen, Phukon, Bornali, Hasegawa-Johnson, Mark

Published in: Proceedings of Interspeech 2024

This paper enhances dysarthric and dysphonic speech recognition by fine-tuning pretrained automatic speech recognition (ASR) models on the 2023-10-05 data package of the Speech Accessibility Project (SAP), which contains the speech of 253 people with

External link: http://arxiv.org/abs/2409.19818

Report

Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue

Authors: Wu, Junkai, Fan, Xulin, Lu, Bo-Ru, Jiang, Xilin, Mesgarani, Nima, Hasegawa-Johnson, Mark, Ostendorf, Mari

In recent years, we have observed a rapid advancement in speech language models (SpeechLLMs), catching up with humans' listening and reasoning abilities. SpeechLLMs have demonstrated impressive spoken dialog question-answering (SQA) performance in be

External link: http://arxiv.org/abs/2409.04927

Report

A Language-agnostic Model of Child Language Acquisition

Authors: Mahon, Louis, Abend, Omri, Berger, Uri, Demuth, Katherine, Johnson, Mark, Steedman, Mark

This work reimplements a recent semantic bootstrapping child-language acquisition model, which was originally designed for English, and trains it to learn a new language: Hebrew. The model learns from pairs of utterances and logical forms as meaning

External link: http://arxiv.org/abs/2408.12254

Report

LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition

Authors: Yoon, Eunseop, Yoon, Hee Suk, Harvill, John, Hasegawa-Johnson, Mark, Yoo, Chang D.

Test-Time Adaptation (TTA) has emerged as a crucial solution to the domain shift challenge, wherein the target environment diverges from the original training environment. A prime exemplification is TTA for Automatic Speech Recognition (ASR), which e

External link: http://arxiv.org/abs/2408.05769

Report

TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback

Authors: Yoon, Eunseop, Yoon, Hee Suk, Eom, SooHwan, Han, Gunsoo, Nam, Daniel Wontae, Jo, Daejin, On, Kyoung-Woon, Hasegawa-Johnson, Mark A., Kim, Sungwoong, Yoo, Chang D.

Reinforcement Learning from Human Feedback (RLHF) leverages human preference data to train language models to align more closely with human essence. These human preference data, however, are labeled at the sequence level, creating a mismatch between

External link: http://arxiv.org/abs/2407.16574

Report

Sound Tagging in Infant-centric Home Soundscapes

Authors: Khan, Mohammad Nur Hossain, Li, Jialu, McElwain, Nancy L., Hasegawa-Johnson, Mark, Islam, Bashima

Certain environmental noises have been associated with negative developmental outcomes for infants and young children. Though classifying or tagging sound events in a domestic environment is an active research area, previous studies focused on data c

External link: http://arxiv.org/abs/2406.17190

Report

Towards Unsupervised Speech Recognition Without Pronunciation Models

Authors: Ni, Junrui, Wang, Liming, Zhang, Yang, Qian, Kaizhi, Gao, Heting, Hasegawa-Johnson, Mark, Yoo, Chang D.

Recent advancements in supervised automatic speech recognition (ASR) have achieved remarkable performance, largely due to the growing availability of large transcribed speech corpora. However, most languages lack sufficient paired speech and text dat

External link: http://arxiv.org/abs/2406.08380

Report

C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion

Authors: Yoon, Hee Suk, Yoon, Eunseop, Tee, Joshua Tian Jin, Hasegawa-Johnson, Mark, Li, Yingzhen, Yoo, Chang D.

In deep learning, test-time adaptation has gained attention as a method for model fine-tuning without the need for labeled data. A prime exemplification is the recently proposed test-time prompt tuning for large-scale vision-language models such as C

External link: http://arxiv.org/abs/2403.14119

Report

AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition

Authors: Eom, SooHwan, Yoon, Eunseop, Yoon, Hee Suk, Kim, Chanwoo, Hasegawa-Johnson, Mark, Yoo, Chang D.

In Automatic Speech Recognition (ASR) systems, a recurring obstacle is the generation of narrowly focused output distributions. This phenomenon emerges as a side effect of Connectionist Temporal Classification (CTC), a robust sequence learning tool t

External link: http://arxiv.org/abs/2403.11578
