Showing 1 - 10 of 1,962 results for search: '"GÓMEZ, JUAN P."'
The teaching innovation project SpaceRaceEdu: development of an educational multiplayer video game for self-study and self-assessment has been carried out under the INNOVA call of the Autonomous University of Madrid during the 2022-2023 academic year
External link:
http://arxiv.org/abs/2410.13875
Author:
Thorbecke, Iuliia, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Carofilis, Andres, Kumar, Shashi, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
Despite the recent success of end-to-end models for automatic speech recognition, recognizing special rare and out-of-vocabulary words, as well as fast domain adaptation with text, are still challenging. It often happens that biasing to the special e
External link:
http://arxiv.org/abs/2409.13514
Author:
Thorbecke, Iuliia, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Kumar, Shashi, Rangappa, Pradeep, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
The training of automatic speech recognition (ASR) with little to no supervised data remains an open question. In this work, we demonstrate that streaming Transformer-Transducer (TT) models can be trained from scratch in consumer and accessible GPUs
External link:
http://arxiv.org/abs/2409.13499
Author:
Tang, William, Feibush, Eliot, Dong, Ge, Borthwick, Noah, Lee, Apollo, Gomez, Juan-Felipe, Gibbs, Tom, Stone, John, Messmer, Peter, Wells, Jack, Wei, Xishuo, Lin, Zhihong
In addressing the Department of Energy's April, 2022 announcement of a Bold Decadal Vision for delivering a Fusion Pilot Plant by 2035, associated software tools need to be developed for the integration of real world engineering and supply chain data
External link:
http://arxiv.org/abs/2409.03112
Author:
Rodriguez-Gomez, Juan Pablo, Dios, Jose Ramiro Martinez-de, Ollero, Anibal, Gallego, Guillermo
Published in:
IEEE Robotics and Automation Letters (RA-L), 2024
Vision-based perception systems are typically exposed to large orientation changes in different robot applications. In such conditions, their performance might be compromised due to the inherent complexity of processing data captured under challengin
External link:
http://arxiv.org/abs/2408.15602
Author:
Wedemeyer, Sven, Szydlarski, Mikolaj, Toribio, M. Carmen, Carozzi, Tobia, Jakobsson, Daniel, Gomez, Juan Camilo Guevara, Eklund, Henrik, Henriques, Vasco M. J., Jafarzadeh, Shahin, Rodriguez, Jaime de la Cruz
The Atacama Large Millimeter/submillimeter Array (ALMA) offers new diagnostic capabilities for studying the Sun, providing complementary insights through high spatial and temporal resolution at millimeter wavelengths. ALMA acts as a linear thermomete
External link:
http://arxiv.org/abs/2408.14265
Author:
Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Thorbecke, Iuliia, Villatoro-Tello, Esaú, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
In traditional conversational intelligence from speech, a cascaded pipeline is used, involving tasks such as voice activity detection, diarization, transcription, and subsequent processing with different NLP models for tasks like semantic endpointing
External link:
http://arxiv.org/abs/2407.04444
Author:
Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Thorbecke, Iuliia, Motlicek, Petr, E, Manjunath K, Ganapathiraju, Aravind
Self-supervised pretrained models exhibit competitive performance in automatic speech recognition on finetuning, even with limited in-domain supervised data. However, popular pretrained models are not suitable for streaming ASR because they are train
External link:
http://arxiv.org/abs/2407.04439
Author:
Kulynych, Bogdan, Gomez, Juan Felipe, Kaissis, Georgios, Calmon, Flavio du Pin, Troncoso, Carmela
Differential privacy (DP) is a widely used approach for mitigating privacy risks when training machine learning models on sensitive data. DP mechanisms add noise during training to limit the risk of information leakage. The scale of the added noise i
External link:
http://arxiv.org/abs/2407.02191
Author:
Ravanelli, Mirco, Parcollet, Titouan, Moumen, Adel, de Langen, Sylvain, Subakan, Cem, Plantinga, Peter, Wang, Yingzhi, Mousavi, Pooneh, Della Libera, Luca, Ploujnikov, Artem, Paissan, Francesco, Borra, Davide, Zaiem, Salah, Zhao, Zeyu, Zhang, Shucong, Karakasidis, Georgios, Yeh, Sung-Lin, Champion, Pierre, Rouhe, Aku, Braun, Rudolf, Mai, Florian, Zuluaga-Gomez, Juan, Mousavi, Seyed Mahed, Nautsch, Andreas, Liu, Xuechen, Sagar, Sangeet, Duret, Jarod, Mdhaffar, Salima, Laperriere, Gaelle, Rouvier, Mickael, De Mori, Renato, Esteve, Yannick
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and
External link:
http://arxiv.org/abs/2407.00463