Showing 1 - 10 of 44 for search: '"David Rybach"'
Author:
Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley
In this work, we study the impact of Large-scale Language Models (LLM) on Automated Speech Recognition (ASR) of YouTube videos, which we use as a source for long-form ASR. We demonstrate up to 8% relative reduction in Word Error Rate (WER) on US Eng
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0d6983808ef866164081e3c5a2b3ffe8
http://arxiv.org/abs/2306.08133
Author:
Weiran Wang, Ding Zhao, Shaojin Ding, Hao Zhang, Shuo-Yiin Chang, David Rybach, Tara N. Sainath, Yanzhang He, Ian McGraw, Shankar Kumar
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Author:
Ehsan Variani, Michael Riley, David Rybach, Cyril Allauzen, Tongzhou Chen, Bhuvana Ramabhadran
Published in:
Interspeech 2022.
Author:
W. Ronny Huang, Shuo-Yiin Chang, David Rybach, Tara Sainath, Rohit Prabhavalkar, Cal Peyser, Zhiyun Lu, Cyril Allauzen
Improving the performance of end-to-end ASR models on long utterances ranging from minutes to hours in length is an ongoing challenge in speech recognition. A common solution is to segment the audio in advance using a separate voice activity detector
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e9c15fa03f1db0e0b6fc4f4d4951f31a
Published in:
Interspeech 2021.
Author:
Trevor Strohman, Yanzhang He, Sean Campbell, Tara N. Sainath, Rohit Prabhavalkar, David Rybach, Arun Narayanan
Published in:
ICASSP
End-to-end models that condition the output label sequence on all previously predicted labels have emerged as popular alternatives to conventional systems for automatic speech recognition (ASR). Since unique label histories correspond to distinct mod
We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table. In particular, we ins
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d2954399b811bdd2ccf1818fa16057d9
http://arxiv.org/abs/2104.04552
Published in:
INTERSPEECH
Published in:
INTERSPEECH
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency
Author:
Zhifeng Chen, Ian McGraw, David Garcia, Mirko Visontai, Yuan Shangguan, Bo Li, Yanzhang He, Qiao Liang, Antoine Bruguier, Tara N. Sainath, Yash Sheth, Yu Zhang, Golan Pundak, Chung-Cheng Chiu, Raziel Alvarez, Ke Hu, Cal Peyser, David Rybach, Alex Gruenstein, Yonghui Wu, Trevor Strohman, Ruoming Pang, Ding Zhao, Rohit Prabhavalkar, Arun Narayanan, Shuo-Yiin Chang, Wei Li, Anjuli Kannan
Published in:
ICASSP
Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking.