Search results - "Chen, I.-Fan"

Report

Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

Authors: Yu, Yu, Yang, Chao-Han Huck, Kolehmainen, Jari, Shivakumar, Prashanth G., Gu, Yile, Ryu, Sungho, Ren, Roger, Luo, Qi, Gourav, Aditya, Chen, I-Fan, Liu, Yi-Chieh, Dinh, Tuan, Gandhe, Ankur, Filimonov, Denis, Ghosh, Shalini, Stolcke, Andreas, Rastow, Ariya, Bulyko, Ivan

Published in: Proc. IEEE ASRU Workshop, Dec. 2023

We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computati

External link: http://arxiv.org/abs/2309.15223

Report

An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition

Authors: Yang, Chao-Han Huck, Chen, I-Fan, Stolcke, Andreas, Siniscalchi, Sabato Marco, Lee, Chin-Hui

Differential privacy (DP) is one data protection avenue to safeguard user information used for training deep models by imposing noisy distortion on privacy data. Such a noise perturbation often results in a severe performance degradation in automatic

External link: http://arxiv.org/abs/2210.05614

Report

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities

Authors: Dheram, Pranav, Ramakrishnan, Murugesan, Raju, Anirudh, Chen, I-Fan, King, Brian, Powell, Katherine, Saboowala, Melissa, Shetty, Karan, Stolcke, Andreas

Published in: Proc. Interspeech, Sept. 2022, pp. 1268-1272

As for other forms of AI, speech recognition has recently been examined with respect to performance disparities across different user cohorts. One approach to achieve fairness in speech recognition is to (1) identify speaker cohorts that suffer from

External link: http://arxiv.org/abs/2207.11345

Report

Investigation of Training Label Error Impact on RNN-T

Authors: Chen, I-Fan, King, Brian, Droppo, Jasha

In this paper, we propose an approach to quantitatively analyze impacts of different training label errors to RNN-T based ASR models. The result shows deletion errors are more harmful than substitution and insertion label errors in RNN-T training dat

External link: http://arxiv.org/abs/2112.00350

Report

Multi-view Frequency LSTM: An Efficient Frontend for Automatic Speech Recognition

Authors: Van Segbroeck, Maarten, Mallidih, Harish, King, Brian, Chen, I-Fan, Chadha, Gurpreet, Maas, Roland

Acoustic models in real-time speech recognition systems typically stack multiple unidirectional LSTM layers to process the acoustic frames over time. Performance improvements over vanilla LSTM architectures have been reported by prepending a stack of

External link: http://arxiv.org/abs/2007.00131

Report

Data Techniques For Online End-to-end Speech Recognition

Authors: Chen, Yang, Wang, Weiran, Chen, I-Fan, Wang, Chao

Practitioners often need to build ASR systems for new use cases in a short amount of time, given limited in-domain data. While recently developed end-to-end methods largely simplify the modeling pipelines, they still suffer from the data sparsity iss

External link: http://arxiv.org/abs/2001.09221

Report

End-to-end Anchored Speech Recognition

Authors: Wang, Yiming, Fan, Xing, Chen, I-Fan, Liu, Yuzong, Chen, Tongfei, Hoffmeister, Björn

Voice-controlled house-hold devices, like Amazon Echo or Google Home, face the problem of performing speech recognition of device-directed speech in the presence of interfering background speech, i.e., background noise and interfering speech from ano

External link: http://arxiv.org/abs/1902.02383

Academic article

A Method for Generating Course Test Questions Based on Natural Language Processing and Deep Learning.

Authors: Wang, Hei-Chia, Chiang, Yu-Hung, Chen, I-Fan

Published in: Education & Information Technologies; 2024, Vol. 29, Issue 7, pp. 8843-8865 (23 pages)

Report

Maximum a Posteriori Adaptation of Network Parameters in Deep Models

Authors: Huang, Zhen, Siniscalchi, Sabato Marco, Chen, I-Fan, Wu, Jiadong, Lee, Chin-Hui

We present a Bayesian approach to adapting parameters of a well-trained context-dependent, deep-neural-network, hidden Markov model (CD-DNN-HMM) to improve automatic speech recognition performance. Given an abundance of DNN parameters but with only a

External link: http://arxiv.org/abs/1503.02108
