Showing 1 - 10 of 22 for search: '"Anjuli Kannan"'
Published in:
Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems.
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency
Author:
Zhifeng Chen, Ian McGraw, David Garcia, Mirko Visontai, Yuan Shangguan, Bo Li, Yanzhang He, Qiao Liang, Antoine Bruguier, Tara N. Sainath, Yash Sheth, Yu Zhang, Golan Pundak, Chung-Cheng Chiu, Raziel Alvarez, Ke Hu, Cal Peyser, David Rybach, Alex Gruenstein, Yonghui Wu, Trevor Strohman, Ruoming Pang, Ding Zhao, Rohit Prabhavalkar, Arun Narayanan, Shuo-Yiin Chang, Wei Li, Anjuli Kannan
Published in:
ICASSP
Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking.
Published in:
ICASSP
Multilingual Automated Speech Recognition (ASR) systems allow for the joint training of data-rich and data-scarce languages in a single model. This enables data and parameter sharing across languages, which is especially beneficial for the data-scarc
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6a151204da3e0d461f9892fc4c305984
http://arxiv.org/abs/2004.09571
Author:
Shuyuan Zhang, Chung-Cheng Chiu, Sergey Kishchenko, Tara N. Sainath, Zhifeng Chen, Hank Liao, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Anjuli Kannan, Ruoming Pang, Patrick Nguyen, Yonghui Wu
Published in:
ASRU
End-to-end automatic speech recognition (ASR) models, including both attention-based models and the recurrent neural network transducer (RNN-T), have shown superior performance compared to conventional systems. However, previous studies have focused
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d880aa8ac41e590195c8425ca98963de
http://arxiv.org/abs/1911.02242
Author:
Yonghui Wu, Zhifeng Chen, Eugene Weinstein, Tara N. Sainath, Seungji Lee, Anjuli Kannan, Arindrima Datta, Ankur Bapna, Bhuvana Ramabhadran
Published in:
INTERSPEECH
Multilingual end-to-end (E2E) models have shown great promise in expansion of automatic speech recognition (ASR) coverage of the world's languages. They have shown improvement over monolingual systems, and have simplified training and serving by elim
Author:
Anjuli Kannan, Khe Chai Sim, Qiao Liang, Tom Bagby, Yuan Shangguan, Yanzhang He, Rohit Prabhavalkar, Ruoming Pang, David Rybach, Golan Pundak, Ian McGraw, Deepti Bhatia, Yonghui Wu, Shuo-Yiin Chang, Bo Li, Ding Zhao, Kanishka Rao, Alexander H. Gruenstein, Tara N. Sainath, Raziel Alvarez
Published in:
ICASSP
End-to-end (E2E) models, which directly predict output character sequences given input speech, are good candidates for on-device speech recognition. E2E models, however, present numerous challenges: In order to be truly useful, such models must decod
Author:
Anjuli Kannan, Jeffrey Dean, Katherine Chou, Laura Vardoulakis, Alvin Rajkomar, Kai Chen, Claire Cui
Published in:
JAMA Internal Medicine. 179(6)
This study assesses the feasibility of using machine learning to automatically populate a review of systems of all symptoms discussed in an encounter between a patient and a clinician.
Author:
Antoine Bruguier, Rohit Prabhavalkar, Patrick Nguyen, David Rybach, Kazuki Irie, Anjuli Kannan
Published in:
INTERSPEECH
In conventional speech recognition, phoneme-based models outperform grapheme-based models for non-phonetic languages such as English. The performance gap between the two typically reduces as the amount of training data is increased. In this work, we
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4d6e95fd61b57d6cd1c377c963700d91
http://arxiv.org/abs/1902.01955
Published in:
ACL (1)
This paper describes novel models tailored for a new application, that of extracting the symptoms mentioned in clinical conversations along with their status. Lack of any publicly available corpus in this privacy-sensitive domain led us to develop ou
Author:
Karen Livescu, Anjuli Kannan, Tara N. Sainath, Yonghui Wu, Chung-Cheng Chiu, Shubham Toshniwal
Published in:
SLT
Attention-based recurrent neural encoder-decoder models present an elegant solution to the automatic speech recognition problem. This approach folds the acoustic model, pronunciation model, and language model into a single network and requires only a