Search results - "Zuluaga-Gómez A"

Report

LM-assisted keyword biasing with Aho-Corasick algorithm for Transducer-based ASR

Authors: Thorbecke, Iuliia; Zuluaga-Gomez, Juan; Villatoro-Tello, Esaú; Carofilis, Andres; Kumar, Shashi; Motlicek, Petr; Pandia, Karthik; Ganapathiraju, Aravind

Despite the recent success of end-to-end models for automatic speech recognition, recognizing special rare and out-of-vocabulary words, as well as fast domain adaptation with text, are still challenging. It often happens that biasing to the special e

External link: http://arxiv.org/abs/2409.13514

Report

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper

Authors: Thorbecke, Iuliia; Zuluaga-Gomez, Juan; Villatoro-Tello, Esaú; Kumar, Shashi; Rangappa, Pradeep; Burdisso, Sergio; Motlicek, Petr; Pandia, Karthik; Ganapathiraju, Aravind

The training of automatic speech recognition (ASR) with little to no supervised data remains an open question. In this work, we demonstrate that streaming Transformer-Transducer (TT) models can be trained from scratch in consumer and accessible GPUs

External link: http://arxiv.org/abs/2409.13499

Report

TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR

Authors: Kumar, Shashi; Madikeri, Srikanth; Zuluaga-Gomez, Juan; Thorbecke, Iuliia; Villatoro-Tello, Esaú; Burdisso, Sergio; Motlicek, Petr; Pandia, Karthik; Ganapathiraju, Aravind

In traditional conversational intelligence from speech, a cascaded pipeline is used, involving tasks such as voice activity detection, diarization, transcription, and subsequent processing with different NLP models for tasks like semantic endpointing

External link: http://arxiv.org/abs/2407.04444

Report

XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models

Authors: Kumar, Shashi; Madikeri, Srikanth; Zuluaga-Gomez, Juan; Villatoro-Tello, Esaú; Thorbecke, Iuliia; Motlicek, Petr; E, Manjunath K; Ganapathiraju, Aravind

Self-supervised pretrained models exhibit competitive performance in automatic speech recognition on finetuning, even with limited in-domain supervised data. However, popular pretrained models are not suitable for streaming ASR because they are train

External link: http://arxiv.org/abs/2407.04439

Report

Open-Source Conversational AI with SpeechBrain 1.0

SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and

External link: http://arxiv.org/abs/2407.00463

Report

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

Authors: Zuluaga-Gomez, Juan; Huang, Zhaocheng; Niu, Xing; Paturi, Rohit; Srinivasan, Sundararajan; Mathur, Prashant; Thompson, Brian; Federico, Marcello

Conventional speech-to-text translation (ST) systems are trained on single-speaker utterances, and they may not generalize to real-life scenarios where the audio contains conversations by multiple speakers. In this paper, we tackle single-channel mul

External link: http://arxiv.org/abs/2311.00697

Report

Implementing contextual biasing in GPU decoder for online ASR

Authors: Nigmatulina, Iuliia; Madikeri, Srikanth; Villatoro-Tello, Esaú; Motliček, Petr; Zuluaga-Gomez, Juan; Pandia, Karthik; Ganapathiraju, Aravind

GPU decoding significantly accelerates the output of ASR predictions. While GPUs are already being used for online ASR decoding, post-processing and rescoring on GPUs have not been properly investigated yet. Rescoring with available contextual inform

External link: http://arxiv.org/abs/2306.15685

Report

CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice

Authors: Zuluaga-Gomez, Juan; Ahmed, Sara; Visockas, Danielius; Subakan, Cem

Despite the recent advancements in Automatic Speech Recognition (ASR), the recognition of accented speech still remains a dominant problem. In order to create more inclusive ASR systems, research has shown that the integration of accent information,

External link: http://arxiv.org/abs/2305.18283

Report

HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition

Authors: Mai, Florian; Zuluaga-Gomez, Juan; Parcollet, Titouan; Motlicek, Petr

State-of-the-art ASR systems have achieved promising results by modeling local and global interactions separately. While the former can be computed efficiently, global interactions are usually modeled via attention mechanisms, which are expensive for

External link: http://arxiv.org/abs/2305.18281

Report

Breast Cancer Diagnosis Using Machine Learning Techniques

Author: Zuluaga-Gomez, Juan

Breast cancer is one of the most threatening diseases in women's life; thus, the early and accurate diagnosis plays a key role in reducing the risk of death in a patient's life. Mammography stands as the reference technique for breast cancer screenin

External link: http://arxiv.org/abs/2305.02482
