Search results

Report

SOT Triggered Neural Clustering for Speaker Attributed ASR

Authors: Zheng, Xianrui; Sun, Guangzhi; Zhang, Chao; Woodland, Philip C.

This paper introduces a novel approach to speaker-attributed ASR transcription using a neural clustering method. With a parallel processing mechanism, diarisation and ASR can be applied simultaneously, helping to prevent the accumulation of errors fr

External link: http://arxiv.org/abs/2407.02007

Report

Promise of Graph Sparsification and Decomposition for Noise Reduction in QAOA: Analysis for Trapped-Ion Compilations

Authors: Moondra, Jai; Lotshaw, Philip C.; Mohler, Greg; Gupta, Swati

We develop new approximate compilation schemes that significantly reduce the expense of compiling the Quantum Approximate Optimization Algorithm (QAOA) for solving the Max-Cut problem. Our main focus is on compilation with trapped-ion simulators usin

External link: http://arxiv.org/abs/2406.14330

Report

Constraining the Stellar Masses and Origin of the Protostellar VLA 1623 System

Authors: Sadavoy, Sarah I; Sheehan, Patrick; Tobin, John J.; Murillo, Nadia M.; Teague, Richard; Stephens, Ian; Henning, Thomas; Myers, Philip C.; Bergin, Edwin A.

We present ALMA Band 7 molecular line observations of the protostars within the VLA 1623 system. We map C$^{17}$O (3 - 2) in the circumbinary disk around VLA 1623A and the outflow cavity walls of the collimated outflow. We further detect red-shifted

External link: http://arxiv.org/abs/2406.12984

Report

Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation

Authors: Deng, Keqi; Woodland, Philip C.

While the neural transducer is popular for online speech recognition, simultaneous speech translation (SST) requires both streaming and re-ordering capabilities. This paper presents the LS-Transducer-SST, a label-synchronous neural transducer for SST

External link: http://arxiv.org/abs/2406.04541

Report

Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning

Authors: Deng, Keqi; Sun, Guangzhi; Woodland, Philip C.

Wav2Prompt is proposed which allows straightforward integration between spoken input and a text-based large language model (LLM). Wav2Prompt uses a simple training process with only the same data used to train an automatic speech recognition (ASR) mo

External link: http://arxiv.org/abs/2406.00522

Report

1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem

Authors: Chen, Mingjie; Zhang, Hezhao; Li, Yuanchao; Luo, Jiachen; Wu, Wen; Ma, Ziyang; Bell, Peter; Lai, Catherine; Reiss, Joshua; Wang, Lin; Woodland, Philip C.; Chen, Xie; Phan, Huy; Hain, Thomas

Speech emotion recognition is a challenging classification task with natural emotional speech, especially when the distribution of emotion types is imbalanced in the training and test data. In this case, it is more difficult for a model to learn to s

External link: http://arxiv.org/abs/2405.20064

Report

Completely (iso-)split scale-invariant Coulomb branch geometries are isotrivial

Authors: Argyres, Philip C.; Moscrop, Robert; Thakur, Souradeep; Weaver, Mitch

We show that scale-invariant special Kähler geometries whose generic r-complex-dimensional abelian variety fiber is isomorphic (completely split) or isogenous (completely iso-split) as a complex torus to the product of r one-dimensional complex tori

External link: http://arxiv.org/abs/2405.19395

Report

Safety, feasibility, and acceptability of a novel device to monitor ischaemic stroke patients

Authors: van Bohemen, Samuel J; Rogers, Jeffrey M; Alavanja, Aleksandra; Evans, Andrew; Young, Noel; Boughton, Philip C; Valderrama, Joaquin; Kyme, Andre Z

This study assessed the safety, feasibility, and acceptability of a novel device to monitor ischaemic stroke patients. The device captured electroencephalography (EEG) and electrocardiography (ECG) data to compute an ECG-based metric termed the Elect

External link: http://arxiv.org/abs/2403.17362

Report

Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation

Authors: Wu, Wen; Li, Bo; Zhang, Chao; Chiu, Chung-Cheng; Li, Qiujia; Bai, Junwen; Sainath, Tara N.; Woodland, Philip C.

The subjective perception of emotion leads to inconsistent labels from human annotators. Typically, utterances lacking majority-agreed labels are excluded when training an emotion classifier, which causes problems when encountering ambiguous emotional

External link: http://arxiv.org/abs/2402.12862

Report

Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation

Authors: Lashkarashvili, Nineli; Wu, Wen; Sun, Guangzhi; Woodland, Philip C.

Published in: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 2024, pp. 10986-10990

Foundation models have shown superior performance for speech emotion recognition (SER). However, given the limited data in emotion corpora, finetuning all parameters of large pre-trained models for SER can be both resource-intensive and susceptible t

External link: http://arxiv.org/abs/2402.11747

