Search results - "Ariya, P. A."

Report

CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing

Authors: Lu, Yen-Ju, Liu, Jing, Thebaud, Thomas, Moro-Velazquez, Laureano, Rastrow, Ariya, Dehak, Najim, Villalba, Jesus

We introduce Condition-Aware Self-Supervised Learning Representation (CA-SSLR), a generalist conditioning model broadly applicable to various speech-processing tasks. Compared to standard fine-tuning methods that optimize for downstream models, CA-SS

External link: http://arxiv.org/abs/2412.04425

Report

Speech Recognition Rescoring with Large Speech-Text Foundation Models

Authors: Shivakumar, Prashanth Gurunath, Kolehmainen, Jari, Gourav, Aditya, Gu, Yi, Gandhe, Ankur, Rastrow, Ariya, Bulyko, Ivan

Large language models (LLMs) have demonstrated the ability to understand human language by leveraging large amounts of text data. Automatic speech recognition (ASR) systems are often limited by available transcribed speech data and benefit from a secon

External link: http://arxiv.org/abs/2409.16654

Report

An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems

Authors: Tulsiani, Hitesh, Chan, David M., Ghosh, Shalini, Lalwani, Garima, Pandey, Prabhat, Bansal, Ankish, Garimella, Sri, Rastrow, Ariya, Hoffmeister, Björn

Dialog systems, such as voice assistants, are expected to engage with users in complex, evolving conversations. Unfortunately, traditional automatic speech recognition (ASR) systems deployed in such applications are usually trained to recognize each

External link: http://arxiv.org/abs/2409.10515

Report

Multi-Modal Retrieval For Large Language Model Based Speech Recognition

Authors: Kolehmainen, Jari, Gourav, Aditya, Shivakumar, Prashanth Gurunath, Gu, Yile, Gandhe, Ankur, Rastrow, Ariya, Strimel, Grant, Bulyko, Ivan

Retrieval is a widely adopted approach for improving language models by leveraging external information. As the field moves towards multi-modal large language models, it is important to extend the pure text-based methods to incorporate other modalities

External link: http://arxiv.org/abs/2406.09618

Report

A Data-Driven Condition Monitoring Method for Capacitor in Modular Multilevel Converter (MMC)

Authors: Ou, Shuyu, Hassanifar, Mahyar, Votava, Martin, Langwasser, Marius, Liserre, Marco, Sangwongwanich, Ariya, Sahoo, Subham, Blaabjerg, Frede

The modular multilevel converter (MMC) is a topology that consists of a high number of capacitors, and degradation of capacitors can lead to converter malfunction, limiting the overall system lifetime. Condition monitoring methods can be applied to a

External link: http://arxiv.org/abs/2404.13399

Report

Semiconductor Devices Condition Monitoring Using Harmonics in Inverter Control Variables

Authors: Ou, Shuyu, Sangwongwanich, Ariya, Sahoo, Subham, Blaabjerg, Frede

The health status of power semiconductor devices in power converters is important but difficult to monitor. This paper analyzes the relationship between harmonics in inverter control variables and a health precursor (the on-state voltage Von of power

External link: http://arxiv.org/abs/2404.09733

Report

Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition

Authors: Yu, Yu, Yang, Chao-Han Huck, Dinh, Tuan, Ryu, Sungho, Kolehmainen, Jari, Ren, Roger, Filimonov, Denis, Shivakumar, Prashanth G., Gandhe, Ankur, Rastow, Ariya, Xu, Jia, Bulyko, Ivan, Stolcke, Andreas

The use of low-rank adaptation (LoRA) with frozen pretrained language models (PLMs) has become increasingly popular as a mainstream, resource-efficient modeling approach for memory-constrained hardware. In this study, we first explore how to enhance mo

External link: http://arxiv.org/abs/2401.10447

Report

Two-pass Endpoint Detection for Speech Recognition

Authors: Raju, Anirudh, Khare, Aparna, He, Di, Sklyar, Ilya, Chen, Long, Alptekin, Sam, Trinh, Viet Anh, Zhang, Zhe, Vaz, Colin, Ravichandran, Venkatesh, Maas, Roland, Rastrow, Ariya

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands. The endpoint detector has to trade off accuracy against latency, since waiting longer reduces the cases of users being

External link: http://arxiv.org/abs/2401.08916

Report

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

Authors: Everson, Kevin, Gu, Yile, Yang, Huck, Shivakumar, Prashanth Gurunath, Lin, Guan-Ting, Kolehmainen, Jari, Bulyko, Ivan, Gandhe, Ankur, Ghosh, Shalini, Hamza, Wael, Lee, Hung-yi, Rastrow, Ariya, Stolcke, Andreas

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In real-world s

External link: http://arxiv.org/abs/2401.02921

Report

Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition

Authors: Chan, David M., Ghosh, Shalini, Tulsiani, Hitesh, Rastrow, Ariya, Hoffmeister, Björn

While word error rates of automatic speech recognition (ASR) systems have consistently fallen, natural language understanding (NLU) applications built on top of ASR systems still attribute significant numbers of failures to low-quality speech recogni

External link: http://arxiv.org/abs/2401.02417
