Showing 1 - 10 of 87 for search: '"Chen, I.-Fan"'
Author:
Yu, Yu, Yang, Chao-Han Huck, Kolehmainen, Jari, Shivakumar, Prashanth G., Gu, Yile, Ryu, Sungho, Ren, Roger, Luo, Qi, Gourav, Aditya, Chen, I-Fan, Liu, Yi-Chieh, Dinh, Tuan, Gandhe, Ankur, Filimonov, Denis, Ghosh, Shalini, Stolcke, Andreas, Rastow, Ariya, Bulyko, Ivan
Published in:
Proc. IEEE ASRU Workshop, Dec. 2023
We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computati
External link:
http://arxiv.org/abs/2309.15223
Differential privacy (DP) is one data protection avenue to safeguard user information used for training deep models by imposing noisy distortion on privacy data. Such a noise perturbation often results in a severe performance degradation in automatic
External link:
http://arxiv.org/abs/2210.05614
Author:
Dheram, Pranav, Ramakrishnan, Murugesan, Raju, Anirudh, Chen, I-Fan, King, Brian, Powell, Katherine, Saboowala, Melissa, Shetty, Karan, Stolcke, Andreas
Published in:
Proc. Interspeech, Sept. 2022, pp. 1268-1272
As for other forms of AI, speech recognition has recently been examined with respect to performance disparities across different user cohorts. One approach to achieve fairness in speech recognition is to (1) identify speaker cohorts that suffer from
External link:
http://arxiv.org/abs/2207.11345
In this paper, we propose an approach to quantitatively analyze impacts of different training label errors to RNN-T based ASR models. The result shows deletion errors are more harmful than substitution and insertion label errors in RNN-T training dat
External link:
http://arxiv.org/abs/2112.00350
Author:
Van Segbroeck, Maarten, Mallidih, Harish, King, Brian, Chen, I-Fan, Chadha, Gurpreet, Maas, Roland
Acoustic models in real-time speech recognition systems typically stack multiple unidirectional LSTM layers to process the acoustic frames over time. Performance improvements over vanilla LSTM architectures have been reported by prepending a stack of
External link:
http://arxiv.org/abs/2007.00131
Practitioners often need to build ASR systems for new use cases in a short amount of time, given limited in-domain data. While recently developed end-to-end methods largely simplify the modeling pipelines, they still suffer from the data sparsity iss
External link:
http://arxiv.org/abs/2001.09221
Voice-controlled household devices, like Amazon Echo or Google Home, face the problem of performing speech recognition of device-directed speech in the presence of interfering background speech, i.e., background noise and interfering speech from ano
External link:
http://arxiv.org/abs/1902.02383
Published in:
Education & Information Technologies; 2024, Vol. 29 Issue 7, p8843-8865, 23p
We present a Bayesian approach to adapting parameters of a well-trained context-dependent, deep-neural-network, hidden Markov model (CD-DNN-HMM) to improve automatic speech recognition performance. Given an abundance of DNN parameters but with only a
External link:
http://arxiv.org/abs/1503.02108
Published in:
Innovations in Education & Teaching International; Feb2024, Vol. 61 Issue 1, p143-153, 11p