Showing 1 - 10
of 2,224
for search: '"Moreno Pedro A"'
Author:
Liu, Hongbin, Chen, Youzheng, Narayanan, Arun, Balachandran, Athula, Moreno, Pedro J., Wang, Lun
Recent advances in text-to-speech (TTS) systems, particularly those with voice cloning capabilities, have made voice impersonation readily accessible, raising ethical and legal concerns due to potential misuse for malicious activities like misinforma
External link:
http://arxiv.org/abs/2410.06572
Author:
Ramírez-Moreno, Pedro
We study the symbolic $F$-splitness of families of binomial edge ideals. We also study the strong $F$-regularity of the symbolic blowup algebras of families of binomial edge ideals. We make use of Fedder-like criteria and combinatorial properties of
External link:
http://arxiv.org/abs/2404.14640
Author:
Prabhavalkar, Rohit, Meng, Zhong, Wang, Weiran, Stooke, Adam, Cai, Xingyu, He, Yanzhang, Narayanan, Arun, Hwang, Dongseong, Sainath, Tara N., Moreno, Pedro J.
The accuracy of end-to-end (E2E) automatic speech recognition (ASR) models continues to improve as they are scaled to larger sizes, with some now reaching billions of parameters. Widespread deployment and adoption of these models, however, requires c
External link:
http://arxiv.org/abs/2402.17184
Author:
Chen, Tongzhou, Allauzen, Cyril, Huang, Yinghui, Park, Daniel, Rybach, David, Huang, W. Ronny, Cabrera, Rodrigo, Audhkhasi, Kartik, Ramabhadran, Bhuvana, Moreno, Pedro J., Riley, Michael
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
In this work, we study the impact of Large-scale Language Models (LLM) on Automated Speech Recognition (ASR) of YouTube videos, which we use as a source for long-form ASR. We demonstrate up to 8% relative reduction in Word Error Rate (WER) on US Eng
External link:
http://arxiv.org/abs/2306.08133
Author:
Zhang, Yu, Han, Wei, Qin, James, Wang, Yongqiang, Bapna, Ankur, Chen, Zhehuai, Chen, Nanxin, Li, Bo, Axelrod, Vera, Wang, Gary, Meng, Zhong, Hu, Ke, Rosenberg, Andrew, Prabhavalkar, Rohit, Park, Daniel S., Haghani, Parisa, Riesa, Jason, Perng, Ginger, Soltau, Hagen, Strohman, Trevor, Ramabhadran, Bhuvana, Sainath, Tara, Moreno, Pedro, Chiu, Chung-Cheng, Schalkwyk, Johan, Beaufays, Françoise, Wu, Yonghui
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages. This is achieved by pre-training the encoder of the model on a large unlabeled multilingual dataset of 12 mill
External link:
http://arxiv.org/abs/2303.01037
Author:
Moreno, Pedro, Rocha, Ricardo
Lock-free data structures are an important tool for the development of concurrent programs as they provide scalability, low latency and avoid deadlocks, livelocks and priority inversion. However, they require some sort of additional support to guaran
External link:
http://arxiv.org/abs/2302.06520
Published in:
Acta Agronómica, Vol 59, Iss 3, Pp 257-272 (2010)
The massive amount of biological data coming from the "omics" disciplines, and its use in plant genetic improvement, requires new theoretical and statistical approaches that satisfactorily describe general principles
External link:
https://doaj.org/article/d18f30b0c77648c9b3f2f0fe4539f5b0
Author:
Meng, Zhong, Chen, Tongzhou, Prabhavalkar, Rohit, Zhang, Yu, Wang, Gary, Audhkhasi, Kartik, Emond, Jesse, Strohman, Trevor, Ramabhadran, Bhuvana, Huang, W. Ronny, Variani, Ehsan, Huang, Yinghui, Moreno, Pedro J.
Published in:
2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar
Text-only adaptation of a transducer model remains challenging for end-to-end speech recognition since the transducer has no clearly separated acoustic model (AM), language model (LM) or blank model. In this work, we propose a modular hybrid autoregr
External link:
http://arxiv.org/abs/2210.17049
Author:
Wang, Gary, Cubuk, Ekin D., Rosenberg, Andrew, Cheng, Shuyang, Weiss, Ron J., Ramabhadran, Bhuvana, Moreno, Pedro J., Le, Quoc V., Park, Daniel S.
Data augmentation is a ubiquitous technique used to provide robustness to automatic speech recognition (ASR) training. However, even as so much of the ASR training process has become automated and more "end-to-end", the data augmentation policy (what
External link:
http://arxiv.org/abs/2210.10879