Showing 1 - 10 of 936 for search: '"Ariya, P. A."'
Author:
Lu, Yen-Ju, Liu, Jing, Thebaud, Thomas, Moro-Velazquez, Laureano, Rastrow, Ariya, Dehak, Najim, Villalba, Jesus
We introduce Condition-Aware Self-Supervised Learning Representation (CA-SSLR), a generalist conditioning model broadly applicable to various speech-processing tasks. Compared to standard fine-tuning methods that optimize for downstream models, CA-SS
External link:
http://arxiv.org/abs/2412.04425
Author:
Shivakumar, Prashanth Gurunath, Kolehmainen, Jari, Gourav, Aditya, Gu, Yi, Gandhe, Ankur, Rastrow, Ariya, Bulyko, Ivan
Large language models (LLMs) have demonstrated the ability to understand human language by leveraging large amounts of text data. Automatic speech recognition (ASR) systems are often limited by available transcribed speech data and benefit from a secon
External link:
http://arxiv.org/abs/2409.16654
Author:
Tulsiani, Hitesh, Chan, David M., Ghosh, Shalini, Lalwani, Garima, Pandey, Prabhat, Bansal, Ankish, Garimella, Sri, Rastrow, Ariya, Hoffmeister, Björn
Dialog systems, such as voice assistants, are expected to engage with users in complex, evolving conversations. Unfortunately, traditional automatic speech recognition (ASR) systems deployed in such applications are usually trained to recognize each
External link:
http://arxiv.org/abs/2409.10515
Author:
Kolehmainen, Jari, Gourav, Aditya, Shivakumar, Prashanth Gurunath, Gu, Yile, Gandhe, Ankur, Rastrow, Ariya, Strimel, Grant, Bulyko, Ivan
Retrieval is a widely adopted approach for improving language models by leveraging external information. As the field moves towards multi-modal large language models, it is important to extend purely text-based methods to incorporate other modalities
External link:
http://arxiv.org/abs/2406.09618
Author:
Ou, Shuyu, Hassanifar, Mahyar, Votava, Martin, Langwasser, Marius, Liserre, Marco, Sangwongwanich, Ariya, Sahoo, Subham, Blaabjerg, Frede
The modular multilevel converter (MMC) is a topology that consists of a high number of capacitors, and degradation of capacitors can lead to converter malfunction, limiting the overall system lifetime. Condition monitoring methods can be applied to a
External link:
http://arxiv.org/abs/2404.13399
The health status of power semiconductor devices in power converters is important but difficult to monitor. This paper analyzes the relationship between harmonics in inverter control variables and a health precursor (the on-state voltage Von of power
External link:
http://arxiv.org/abs/2404.09733
Author:
Yu, Yu, Yang, Chao-Han Huck, Dinh, Tuan, Ryu, Sungho, Kolehmainen, Jari, Ren, Roger, Filimonov, Denis, Shivakumar, Prashanth G., Gandhe, Ankur, Rastow, Ariya, Xu, Jia, Bulyko, Ivan, Stolcke, Andreas
The use of low-rank adaptation (LoRA) with frozen pretrained language models (PLMs) has become increasingly popular as a mainstream, resource-efficient modeling approach for memory-constrained hardware. In this study, we first explore how to enhance mo
External link:
http://arxiv.org/abs/2401.10447
Author:
Raju, Anirudh, Khare, Aparna, He, Di, Sklyar, Ilya, Chen, Long, Alptekin, Sam, Trinh, Viet Anh, Zhang, Zhe, Vaz, Colin, Ravichandran, Venkatesh, Maas, Roland, Rastrow, Ariya
Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands. The endpoint detector has to trade off between accuracy and latency, since waiting longer reduces the cases of users being
External link:
http://arxiv.org/abs/2401.08916
Author:
Everson, Kevin, Gu, Yile, Yang, Huck, Shivakumar, Prashanth Gurunath, Lin, Guan-Ting, Kolehmainen, Jari, Bulyko, Ivan, Gandhe, Ankur, Ghosh, Shalini, Hamza, Wael, Lee, Hung-yi, Rastrow, Ariya, Stolcke, Andreas
In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In real-world s
External link:
http://arxiv.org/abs/2401.02921
While word error rates of automatic speech recognition (ASR) systems have consistently fallen, natural language understanding (NLU) applications built on top of ASR systems still attribute significant numbers of failures to low-quality speech recogni
External link:
http://arxiv.org/abs/2401.02417