Showing 1 - 10 of 1 624 for search: '"A. Villatoro-Tello"'
Author:
Kumar, Shashi, Thorbecke, Iuliia, Burdisso, Sergio, Villatoro-Tello, Esaú, E, Manjunath K, Hacioğlu, Kadri, Rangappa, Pradeep, Motlicek, Petr, Ganapathiraju, Aravind, Stolcke, Andreas
Recent research has demonstrated that training a linear connector between speech foundation encoders and large language models (LLMs) enables this architecture to achieve strong ASR capabilities. Despite the impressive results, it remains unclear whe
External link:
http://arxiv.org/abs/2411.03866
Bias assessment of news sources is paramount for professionals, organizations, and researchers who rely on truthful evidence for information gathering and reporting. While certain bias indicators are discernible from content analysis, descriptors lik
External link:
http://arxiv.org/abs/2410.17655
Author:
Thorbecke, Iuliia, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Carofilis, Andres, Kumar, Shashi, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
Despite the recent success of end-to-end models for automatic speech recognition, recognizing special rare and out-of-vocabulary words, as well as fast domain adaptation with text, are still challenging. It often happens that biasing to the special e
External link:
http://arxiv.org/abs/2409.13514
Author:
Thorbecke, Iuliia, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Kumar, Shashi, Rangappa, Pradeep, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
The training of automatic speech recognition (ASR) with little to no supervised data remains an open question. In this work, we demonstrate that streaming Transformer-Transducer (TT) models can be trained from scratch in consumer and accessible GPUs
External link:
http://arxiv.org/abs/2409.13499
Author:
Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Thorbecke, Iuliia, Villatoro-Tello, Esaú, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind
In traditional conversational intelligence from speech, a cascaded pipeline is used, involving tasks such as voice activity detection, diarization, transcription, and subsequent processing with different NLP models for tasks like semantic endpointing
External link:
http://arxiv.org/abs/2407.04444
Author:
Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Villatoro-Tello, Esaú, Thorbecke, Iuliia, Motlicek, Petr, E, Manjunath K, Ganapathiraju, Aravind
Self-supervised pretrained models exhibit competitive performance in automatic speech recognition on finetuning, even with limited in-domain supervised data. However, popular pretrained models are not suitable for streaming ASR because they are train
External link:
http://arxiv.org/abs/2407.04439
Author:
Burdisso, Sergio, Reyes-Ramírez, Ernesto, Villatoro-Tello, Esaú, Sánchez-Vega, Fernando, López-Monroy, Pastor, Motlicek, Petr
Automatic depression detection from conversational data has gained significant interest in recent years. The DAIC-WOZ dataset, interviews conducted by a human-controlled virtual agent, has been widely used for this task. Recent studies have reported
External link:
http://arxiv.org/abs/2404.14463
Evaluating the reliability of news sources is a routine task for journalists and organizations committed to acquiring and disseminating accurate information. Recent research has shown that predicting sources' reliability represents an important first
External link:
http://arxiv.org/abs/2404.09565
Published in:
Interspeech 2023
We propose a simple approach for weighting self-connecting edges in a Graph Convolutional Network (GCN) and show its impact on depression detection from transcribed clinical interviews. To this end, we use a GCN for modeling non-consecutive and long-
External link:
http://arxiv.org/abs/2307.00920
Author:
Nigmatulina, Iuliia, Madikeri, Srikanth, Villatoro-Tello, Esaú, Motliček, Petr, Zuluaga-Gomez, Juan, Pandia, Karthik, Ganapathiraju, Aravind
GPU decoding significantly accelerates the output of ASR predictions. While GPUs are already being used for online ASR decoding, post-processing and rescoring on GPUs have not been properly investigated yet. Rescoring with available contextual inform
External link:
http://arxiv.org/abs/2306.15685