Showing 1 - 10 of 28 for search: '"Ian McGraw"'
Author:
Tom O’Malley, Shaojin Ding, Arun Narayanan, Quan Wang, Rajeev Rikhye, Qiao Liang, Yanzhang He, Ian McGraw
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Author:
Weiran Wang, Ding Zhao, Shaojin Ding, Hao Zhang, Shuo-Yiin Chang, David Rybach, Tara N. Sainath, Yanzhang He, Ian McGraw, Shankar Kumar
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Author:
Shaojin Ding, Weiran Wang, Ding Zhao, Tara Sainath, Yanzhang He, Robert David, Rami Botros, Xin Wang, Rina Panigrahy, Qiao Liang, Dongseong Hwang, Ian McGraw, Rohit Prabhavalkar, Trevor Strohman
In this paper, we propose a dynamic cascaded encoder Automatic Speech Recognition (ASR) model, which unifies models for different deployment scenarios. Moreover, the model can significantly reduce model size and power consumption without loss of qual
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d8444eebad257eba5a9ce7891486e95e
http://arxiv.org/abs/2204.06164
Author:
Quan Wang, Ian McGraw, Arun Narayanan, Yanzhang He, Ding Zhao, Qiao Liang, Rajeev V. Rikhye, Yiteng Huang
Published in:
Interspeech 2021.
Confidence scores are very useful for downstream applications of automatic speech recognition (ASR) systems. Recent works have proposed using neural networks to learn word or utterance confidence scores for end-to-end ASR. In those studies, word conf
Author:
Yanzhang He, Ke Hu, Rohit Prabhavalkar, Deepti Bhatia, Yu Zhang, Wei Li, David Qiu, Qiujia Li, Tara N. Sainath, Bo Li, Ian McGraw, Liangliang Cao
Published in:
ICASSP
We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior works have proposed training auxiliary confidence models for ASR systems, they do not extend natu
In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass. This is achieved by using an attention mechanism on multiple speaker embeddi
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::308a8e8a0f4ff2e3f0fd373a0cd2d01a
Published in:
INTERSPEECH
The demand for fast and accurate incremental speech recognition increases as the applications of automatic speech recognition (ASR) proliferate. Incremental speech recognizers output chunks of partially recognized words while the user is still talkin
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5eeb82a8c596cc512b5c769a7746f3e9
http://arxiv.org/abs/2006.01416
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency
Author:
Zhifeng Chen, Ian McGraw, David Garcia, Mirko Visontai, Yuan Shangguan, Bo Li, Yanzhang He, Qiao Liang, Antoine Bruguier, Tara N. Sainath, Yash Sheth, Yu Zhang, Golan Pundak, Chung-Cheng Chiu, Raziel Alvarez, Ke Hu, Cal Peyser, David Rybach, Alex Gruenstein, Yonghui Wu, Trevor Strohman, Ruoming Pang, Ding Zhao, Rohit Prabhavalkar, Arun Narayanan, Shuo-Yiin Chang, Wei Li, Anjuli Kannan
Published in:
ICASSP
Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking.
Author:
Nathan A. James, Ian McGraw, C. Keith Stone, Margaret K. Strecker-McGraw, Jared Glenn, Karim Jabbar
Published in:
The Journal of Emergency Medicine. 53:717-721
Background: The treatment of acute ischemic stroke with recombinant tissue plasminogen activator (rtPA) has become the mainstay of treatment, but its use carries a risk of subsequent intracranial hemorrhage (ICH). Guidelines have been developed to aid