Showing 1 - 10 of 460 for search: '"Wang Weiran"'
Author:
Wang Weiran
Published in:
SHS Web of Conferences, Vol 155, p 02013 (2023)
This study examines whether positive and negative news about celebrity endorsers affects the consumer behavior of Chinese Gen Z consumers. A quantitative study was conducted to measure changes in Chinese Gen Z consumers’ attitudes, favorability,
External link:
https://doaj.org/article/222f5335b2aa4f8884b1d0d00bc01c39
Author:
Meng, Zhong, Wu, Zelin, Prabhavalkar, Rohit, Peyser, Cal, Wang, Weiran, Chen, Nanxin, Sainath, Tara N., Ramabhadran, Bhuvana
Published in:
Interspeech 2024, Kos Island, Greece
Neural contextual biasing effectively improves automatic speech recognition (ASR) for crucial phrases within a speaker's context, particularly those that are infrequent in the training data. This work proposes contextual text injection (CTI) to enhan
External link:
http://arxiv.org/abs/2406.02921
Author:
Wu, Zelin, Song, Gan, Li, Christopher, Rondon, Pat, Meng, Zhong, Velez, Xavier, Wang, Weiran, Caseiro, Diamantino, Pundak, Golan, Munkhdalai, Tsendsuren, Chandorkar, Angad, Prabhavalkar, Rohit
Published in:
2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics - Industry Track
Contextual biasing enables speech recognizers to transcribe important phrases in the speaker's context, such as contact names, even if they are rare in, or absent from, the training data. Attention-based biasing is a leading approach which allows for
External link:
http://arxiv.org/abs/2404.10180
While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel Transformer architecture that leverages a feedback
External link:
http://arxiv.org/abs/2404.09173
Author:
Prabhavalkar, Rohit, Meng, Zhong, Wang, Weiran, Stooke, Adam, Cai, Xingyu, He, Yanzhang, Narayanan, Arun, Hwang, Dongseong, Sainath, Tara N., Moreno, Pedro J.
The accuracy of end-to-end (E2E) automatic speech recognition (ASR) models continues to improve as they are scaled to larger sizes, with some now reaching billions of parameters. Widespread deployment and adoption of these models, however, requires c
External link:
http://arxiv.org/abs/2402.17184
Author:
Ding, Shaojin, Qiu, David, Rim, David, He, Yanzhang, Rybakov, Oleg, Li, Bo, Prabhavalkar, Rohit, Wang, Weiran, Sainath, Tara N., Han, Zhonglin, Li, Jian, Yazdanbakhsh, Amir, Agrawal, Shivani
End-to-end automatic speech recognition (ASR) models have seen revolutionary quality gains with the recent development of large-scale universal speech models (USM). However, deploying these massive USMs is extremely expensive due to the enormous memo
External link:
http://arxiv.org/abs/2312.08553
Author:
Wang, Weiran, Wu, Zelin, Caseiro, Diamantino, Munkhdalai, Tsendsuren, Sim, Khe Chai, Rondon, Pat, Pundak, Golan, Song, Gan, Prabhavalkar, Rohit, Meng, Zhong, Zhao, Ding, Sainath, Tara, Mengibar, Pedro Moreno
Contextual biasing refers to the problem of biasing automatic speech recognition (ASR) systems towards rare entities that are relevant to specific users or application scenarios. We propose algorithms for contextual biasing based on the Knuth-
External link:
http://arxiv.org/abs/2310.00178
Author:
Wang, Weiran, Prabhavalkar, Rohit, Hwang, Dongseong, Li, Qiujia, Sim, Khe Chai, Li, Bo, Qin, James, Cai, Xingyu, Stooke, Adam, Meng, Zhong, Zheng, CJ, He, Yanzhang, Sainath, Tara, Mengibar, Pedro Moreno
In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model paramet
External link:
http://arxiv.org/abs/2309.12963
Online speech recognition, where the model only accesses context to the left, is an important and challenging use case for ASR systems. In this work, we investigate augmenting neural encoders for online ASR by incorporating structured state-space seq
External link:
http://arxiv.org/abs/2309.08551
While standard speaker diarization attempts to answer the question "who spoke when", most relevant applications in practice are more interested in determining "who spoke what". Whether it is the conventional modularized approach or the more recen
External link:
http://arxiv.org/abs/2309.08489