Showing 1 - 10 of 2,626 results for search: '"MORENO, PEDRO"'
Author:
Liu, Hongbin, Chen, Youzheng, Narayanan, Arun, Balachandran, Athula, Moreno, Pedro J., Wang, Lun
Recent advances in text-to-speech (TTS) systems, particularly those with voice cloning capabilities, have made voice impersonation readily accessible, raising ethical and legal concerns due to potential misuse for malicious activities like misinforma
External link:
http://arxiv.org/abs/2410.06572
Author:
Ramírez-Moreno, Pedro
We study the symbolic $F$-splitness of families of binomial edge ideals. We also study the strong $F$-regularity of the symbolic blowup algebras of families of binomial edge ideals. We make use of Fedder-like criteria and combinatorial properties of
External link:
http://arxiv.org/abs/2404.14640
Author:
Prabhavalkar, Rohit, Meng, Zhong, Wang, Weiran, Stooke, Adam, Cai, Xingyu, He, Yanzhang, Narayanan, Arun, Hwang, Dongseong, Sainath, Tara N., Moreno, Pedro J.
The accuracy of end-to-end (E2E) automatic speech recognition (ASR) models continues to improve as they are scaled to larger sizes, with some now reaching billions of parameters. Widespread deployment and adoption of these models, however, requires c
External link:
http://arxiv.org/abs/2402.17184
Author:
Chen, Tongzhou, Allauzen, Cyril, Huang, Yinghui, Park, Daniel, Rybach, David, Huang, W. Ronny, Cabrera, Rodrigo, Audhkhasi, Kartik, Ramabhadran, Bhuvana, Moreno, Pedro J., Riley, Michael
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
In this work, we study the impact of Large-scale Language Models (LLM) on Automated Speech Recognition (ASR) of YouTube videos, which we use as a source for long-form ASR. We demonstrate up to 8% relative reduction in Word Error Rate (WER) on US Eng
External link:
http://arxiv.org/abs/2306.08133
Author:
Zhang, Yu, Han, Wei, Qin, James, Wang, Yongqiang, Bapna, Ankur, Chen, Zhehuai, Chen, Nanxin, Li, Bo, Axelrod, Vera, Wang, Gary, Meng, Zhong, Hu, Ke, Rosenberg, Andrew, Prabhavalkar, Rohit, Park, Daniel S., Haghani, Parisa, Riesa, Jason, Perng, Ginger, Soltau, Hagen, Strohman, Trevor, Ramabhadran, Bhuvana, Sainath, Tara, Moreno, Pedro, Chiu, Chung-Cheng, Schalkwyk, Johan, Beaufays, Françoise, Wu, Yonghui
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages. This is achieved by pre-training the encoder of the model on a large unlabeled multilingual dataset of 12 mill
External link:
http://arxiv.org/abs/2303.01037