Showing 1 - 10 of 5,019 for search: '"Zuluaga, P."'
Wearable devices like smartwatches, wristbands, and fitness trackers are designed to be lightweight devices to be worn on the human body. With the increased connectivity of wearable devices, they will become integral to remote healthcare solutions. F
External link:
http://arxiv.org/abs/2410.07629
Author:
Thorbecke, Iuliia, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Carofilis, Andres, Kumar, Shashi, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
Despite the recent success of end-to-end models for automatic speech recognition, recognizing special rare and out-of-vocabulary words, as well as fast domain adaptation with text, are still challenging. It often happens that biasing to the special e
External link:
http://arxiv.org/abs/2409.13514
Author:
Thorbecke, Iuliia, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Kumar, Shashi, Rangappa, Pradeep, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
The training of automatic speech recognition (ASR) with little to no supervised data remains an open question. In this work, we demonstrate that streaming Transformer-Transducer (TT) models can be trained from scratch in consumer and accessible GPUs
External link:
http://arxiv.org/abs/2409.13499
Author:
Sucerquia, Mario, Alvarado-Montes, Jaime A., Zuluaga, Jorge I., Cuello, Nicolás, Cuadra, Jorge, Montesinos, Matías
Rings are complex structures surrounding giant planets and some minor bodies in the Solar System. While some formation mechanisms could also potentially foster their existence around (regular or irregular) satellites, none of these bodies currently b
External link:
http://arxiv.org/abs/2408.10643
The Ohio State University Big Ear radio telescope detected in 1977 the Wow! Signal, one of the most famous and intriguing signals of extraterrestrial origin. Characterized by its strong relative intensity and narrow bandwidth near the 1420 MHz hydrog
External link:
http://arxiv.org/abs/2408.08513
Combining multiple modalities carrying complementary information through multimodal learning (MML) has shown considerable benefits for diagnosing multiple pathologies. However, the robustness of multimodal models to missing modalities is often overlo
External link:
http://arxiv.org/abs/2407.20768
The search for atmospheric biosignatures in Earth-like exoplanets is one of the most pressing challenges in observational astrobiology. Detecting biogenic gases in terrestrial planets requires high resolution and long integration times. In this work,
External link:
http://arxiv.org/abs/2407.19167
Author:
Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Thorbecke, Iuliia, Villatoro-Tello, Esaú, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
In traditional conversational intelligence from speech, a cascaded pipeline is used, involving tasks such as voice activity detection, diarization, transcription, and subsequent processing with different NLP models for tasks like semantic endpointing
External link:
http://arxiv.org/abs/2407.04444
Author:
Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Thorbecke, Iuliia, Motlicek, Petr, E, Manjunath K, Ganapathiraju, Aravind
Self-supervised pretrained models exhibit competitive performance in automatic speech recognition on finetuning, even with limited in-domain supervised data. However, popular pretrained models are not suitable for streaming ASR because they are train
External link:
http://arxiv.org/abs/2407.04439
Author:
Ravanelli, Mirco, Parcollet, Titouan, Moumen, Adel, de Langen, Sylvain, Subakan, Cem, Plantinga, Peter, Wang, Yingzhi, Mousavi, Pooneh, Della Libera, Luca, Ploujnikov, Artem, Paissan, Francesco, Borra, Davide, Zaiem, Salah, Zhao, Zeyu, Zhang, Shucong, Karakasidis, Georgios, Yeh, Sung-Lin, Champion, Pierre, Rouhe, Aku, Braun, Rudolf, Mai, Florian, Zuluaga-Gomez, Juan, Mousavi, Seyed Mahed, Nautsch, Andreas, Liu, Xuechen, Sagar, Sangeet, Duret, Jarod, Mdhaffar, Salima, Laperriere, Gaelle, Rouvier, Mickael, De Mori, Renato, Esteve, Yannick
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and
External link:
http://arxiv.org/abs/2407.00463