Showing 1 - 10 of 46,032 results for search: '"Philip, C."'
This paper introduces a novel approach to speaker-attributed ASR transcription using a neural clustering method. With a parallel processing mechanism, diarisation and ASR can be applied simultaneously, helping to prevent the accumulation of errors fr
External link:
http://arxiv.org/abs/2407.02007
We develop new approximate compilation schemes that significantly reduce the expense of compiling the Quantum Approximate Optimization Algorithm (QAOA) for solving the Max-Cut problem. Our main focus is on compilation with trapped-ion simulators usin
External link:
http://arxiv.org/abs/2406.14330
Author:
Sadavoy, Sarah I, Sheehan, Patrick, Tobin, John J., Murillo, Nadia M., Teague, Richard, Stephens, Ian, Henning, Thomas, Myers, Philip C., Bergin, Edwin A.
We present ALMA Band 7 molecular line observations of the protostars within the VLA 1623 system. We map C$^{17}$O (3 - 2) in the circumbinary disk around VLA 1623A and the outflow cavity walls of the collimated outflow. We further detect red-shifted
External link:
http://arxiv.org/abs/2406.12984
Author:
Deng, Keqi, Woodland, Philip C.
While the neural transducer is popular for online speech recognition, simultaneous speech translation (SST) requires both streaming and re-ordering capabilities. This paper presents the LS-Transducer-SST, a label-synchronous neural transducer for SST
External link:
http://arxiv.org/abs/2406.04541
Wav2Prompt is proposed which allows straightforward integration between spoken input and a text-based large language model (LLM). Wav2Prompt uses a simple training process with only the same data used to train an automatic speech recognition (ASR) mo
External link:
http://arxiv.org/abs/2406.00522
Author:
Chen, Mingjie, Zhang, Hezhao, Li, Yuanchao, Luo, Jiachen, Wu, Wen, Ma, Ziyang, Bell, Peter, Lai, Catherine, Reiss, Joshua, Wang, Lin, Woodland, Philip C., Chen, Xie, Phan, Huy, Hain, Thomas
Speech emotion recognition is a challenging classification task with natural emotional speech, especially when the distribution of emotion types is imbalanced in the training and test data. In this case, it is more difficult for a model to learn to s
External link:
http://arxiv.org/abs/2405.20064
We show that scale-invariant special Kähler geometries whose generic r-complex-dimensional abelian variety fiber is isomorphic (completely split) or isogenous (completely iso-split) as a complex torus to the product of r one-dimensional complex tori
External link:
http://arxiv.org/abs/2405.19395
Author:
van Bohemen, Samuel J, Rogers, Jeffrey M, Alavanja, Aleksandra, Evans, Andrew, Young, Noel, Boughton, Philip C, Valderrama, Joaquin, Kyme, Andre Z
This study assessed the safety, feasibility, and acceptability of a novel device to monitor ischaemic stroke patients. The device captured electroencephalography (EEG) and electrocardiography (ECG) data to compute an ECG-based metric termed the Elect
External link:
http://arxiv.org/abs/2403.17362
Author:
Wu, Wen, Li, Bo, Zhang, Chao, Chiu, Chung-Cheng, Li, Qiujia, Bai, Junwen, Sainath, Tara N., Woodland, Philip C.
The subjective perception of emotion leads to inconsistent labels from human annotators. Typically, utterances lacking majority-agreed labels are excluded when training an emotion classifier, which causes problems when encountering ambiguous emotional
External link:
http://arxiv.org/abs/2402.12862
Published in:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 2024, pp. 10986-10990
Foundation models have shown superior performance for speech emotion recognition (SER). However, given the limited data in emotion corpora, finetuning all parameters of large pre-trained models for SER can be both resource-intensive and susceptible t
External link:
http://arxiv.org/abs/2402.11747