Showing 1 - 10 of 52 for search: '"McGraw, Ian"'
Author:
Hernandez, Steven M., Zhao, Ding, Ding, Shaojin, Bruguier, Antoine, Prabhavalkar, Rohit, Sainath, Tara N., He, Yanzhang, McGraw, Ian
Continued improvements in machine learning techniques offer exciting new opportunities through the use of larger models and larger training datasets. However, there is a growing need to offer these new capabilities on-board low-powered devices such a
External link:
http://arxiv.org/abs/2303.08343
Author:
Ding, Shaojin, Wang, Weiran, Zhao, Ding, Sainath, Tara N., He, Yanzhang, David, Robert, Botros, Rami, Wang, Xin, Panigrahy, Rina, Liang, Qiao, Hwang, Dongseong, McGraw, Ian, Prabhavalkar, Rohit, Strohman, Trevor
In this paper, we propose a dynamic cascaded encoder Automatic Speech Recognition (ASR) model, which unifies models for different deployment scenarios. Moreover, the model can significantly reduce model size and power consumption without loss of qual
External link:
http://arxiv.org/abs/2204.06164
Author:
Ding, Shaojin, Rikhye, Rajeev, Liang, Qiao, He, Yanzhang, Wang, Quan, Narayanan, Arun, O'Malley, Tom, McGraw, Ian
Personalization of on-device speech recognition (ASR) has seen explosive growth in recent years, largely due to the increasing popularity of personal assistant features on mobile devices and smart home speakers. In this work, we present Personal VAD
External link:
http://arxiv.org/abs/2204.03793
VoiceFilter-Lite is a speaker-conditioned voice separation model that plays a crucial role in improving speech recognition and speaker verification by suppressing overlapping speech from non-target speakers. However, one limitation of VoiceFilter-Lit
External link:
http://arxiv.org/abs/2202.12169
Published in:
In Injury November 2024 55(11)
In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass. This is achieved by using an attention mechanism on multiple speaker embeddi
External link:
http://arxiv.org/abs/2107.01201
Author:
Rikhye, Rajeev, Wang, Quan, Liang, Qiao, He, Yanzhang, Zhao, Ding, Yiteng, Huang, Narayanan, Arun, McGraw, Ian
In this paper, we introduce a streaming keyphrase detection system that can be easily customized to accurately detect any phrase composed of words from a large vocabulary. The system is implemented with an end-to-end trained automatic speech recognit
External link:
http://arxiv.org/abs/2104.13970
Confidence scores are very useful for downstream applications of automatic speech recognition (ASR) systems. Recent works have proposed using neural networks to learn word or utterance confidence scores for end-to-end ASR. In those studies, word conf
External link:
http://arxiv.org/abs/2104.12870
Author:
Qiu, David, Li, Qiujia, He, Yanzhang, Zhang, Yu, Li, Bo, Cao, Liangliang, Prabhavalkar, Rohit, Bhatia, Deepti, Li, Wei, Hu, Ke, Sainath, Tara N., McGraw, Ian
We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior works have proposed training auxiliary confidence models for ASR systems, they do not extend natu
External link:
http://arxiv.org/abs/2103.06716
The demand for fast and accurate incremental speech recognition increases as the applications of automatic speech recognition (ASR) proliferate. Incremental speech recognizers output chunks of partially recognized words while the user is still talkin
External link:
http://arxiv.org/abs/2006.01416