Showing 1 - 10 of 851 results for the search: '"Stolcke, A."'
With the recent proliferation of large language models (LLMs), enterprises have been able to rapidly develop proof-of-concepts and prototypes. As a result, there is a growing need to implement robust guardrails that monitor, quantify, and control an LLM…
External link:
http://arxiv.org/abs/2411.14398
Authors:
Kumar, Shashi, Thorbecke, Iuliia, Burdisso, Sergio, Villatoro-Tello, Esaú, E, Manjunath K, Hacioğlu, Kadri, Rangappa, Pradeep, Motlicek, Petr, Ganapathiraju, Aravind, Stolcke, Andreas
Recent research has demonstrated that training a linear connector between speech foundation encoders and large language models (LLMs) enables this architecture to achieve strong ASR capabilities. Despite the impressive results, it remains unclear whether…
External link:
http://arxiv.org/abs/2411.03866
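A minimal sketch of the connector idea this abstract describes: a single trainable linear layer maps frames from a frozen speech encoder into the frozen LLM's embedding space. The dimensions (1024-d speech frames, 4096-d LLM embeddings) are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LinearConnector(nn.Module):
    """Project frozen speech-encoder frames into an LLM's embedding space."""

    def __init__(self, speech_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(speech_dim, llm_dim)  # the only trained module

    def forward(self, speech_frames: torch.Tensor) -> torch.Tensor:
        # speech_frames: (batch, time, speech_dim) from a frozen encoder
        return self.proj(speech_frames)  # (batch, time, llm_dim) "soft tokens"

connector = LinearConnector()
frames = torch.randn(2, 50, 1024)  # stand-in for real encoder outputs
soft_tokens = connector(frames)    # would be prepended to LLM input embeddings
print(soft_tokens.shape)           # torch.Size([2, 50, 4096])
```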
Authors:
Sankararaman, Hithesh, Yasin, Mohammed Nasheed, Sorensen, Tanner, Di Bari, Alessandro, Stolcke, Andreas
Published in:
Proc. 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pp. 1305-1313
We present a lightweight approach for detecting nonfactual outputs from retrieval-augmented generation (RAG). Given a context and putative output, we compute a factuality score that can be thresholded to yield a binary decision to check the results…
External link:
http://arxiv.org/abs/2411.01022
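A hedged sketch of the thresholding step the abstract mentions; how the factuality score itself is computed is the paper's contribution and is not reproduced here. Threshold and scores below are illustrative.

```python
import numpy as np

def flag_nonfactual(scores: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return True where a RAG output should be flagged as nonfactual."""
    # Scores are assumed to lie in [0, 1], higher meaning more factual.
    return scores < threshold

scores = np.array([0.92, 0.31, 0.77])
print(flag_nonfactual(scores))  # [False  True False]
```

Sweeping the threshold trades precision against recall, which is what makes a single scalar score convenient for a low-latency binary check.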
Retrieval-augmented generation (RAG) pipelines are commonly used in tasks such as question answering (QA), relying on retrieving relevant documents from a vector store computed using a pretrained embedding model. However, if the retrieved context is…
External link:
http://arxiv.org/abs/2410.12890
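For reference, a minimal sketch of the retrieval step in a standard RAG pipeline as described in the abstract: the query and documents are embedded with the same pretrained model, and the top-k documents by cosine similarity become the context. The random vectors stand in for real embeddings.

```python
import numpy as np

def top_k_documents(query_emb: np.ndarray, doc_embs: np.ndarray, k: int = 3):
    """Return indices of the k documents most similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q                       # cosine similarity per document
    return np.argsort(sims)[::-1][:k]  # indices of the k best matches

doc_embs = np.random.randn(100, 384)  # stand-in for a vector store
query_emb = np.random.randn(384)      # stand-in for an embedded query
print(top_k_documents(query_emb, doc_embs))
```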
Authors:
Yang, Chao-Han Huck, Park, Taejin, Gong, Yuan, Li, Yuanchao, Chen, Zhehuai, Lin, Yen-Ting, Chen, Chen, Hu, Yuchen, Dhawan, Kunal, Żelasko, Piotr, Zhang, Chao, Chen, Yun-Nung, Tsao, Yu, Balam, Jagadeesh, Ginsburg, Boris, Siniscalchi, Sabato Marco, Chng, Eng Siong, Bell, Peter, Lai, Catherine, Watanabe, Shinji, Stolcke, Andreas
Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To explore new capabilities…
External link:
http://arxiv.org/abs/2409.09785
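One common way to use LLMs on top of a frozen ASR model, as the abstract alludes to, is to prompt the LLM with the recognizer's N-best hypotheses and ask for a corrected transcript. The prompt wording below is an illustrative assumption, not the paper's format.

```python
def build_correction_prompt(nbest: list[str]) -> str:
    """Ask an LLM to produce a corrected transcript from ASR N-best output."""
    ranked = "\n".join(f"{i + 1}. {hyp}" for i, hyp in enumerate(nbest))
    return (
        "The following are ranked speech recognition hypotheses of the same "
        "utterance. Output the most likely correct transcript.\n"
        f"{ranked}\nCorrected transcript:"
    )

nbest = ["i scream for ice cream", "eyes cream for ice cream"]
print(build_correction_prompt(nbest))
# The resulting string would be sent to any instruction-following LLM.
```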
Authors:
Wang, Jinhan, Chen, Long, Khare, Aparna, Raju, Anirudh, Dheram, Pranav, He, Di, Wu, Minhua, Stolcke, Andreas, Ravichandran, Venkatesh
We propose an approach for continuous prediction of turn-taking and backchanneling locations in spoken dialogue by fusing a neural acoustic model with a large language model (LLM). Experiments on the Switchboard human-human conversation dataset demonstrate…
External link:
http://arxiv.org/abs/2401.14717
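An illustrative sketch of one simple way to fuse the two signals named in the abstract (weighted late fusion of per-frame probabilities); this is an assumption for exposition, not necessarily the paper's fusion mechanism.

```python
import numpy as np

def fuse_turn_probs(p_acoustic: np.ndarray, p_llm: np.ndarray,
                    w: float = 0.6) -> np.ndarray:
    """Per-frame probability that a turn change or backchannel occurs."""
    # Convex combination of the acoustic and LLM-based predictions.
    return w * p_acoustic + (1.0 - w) * p_llm

p_acoustic = np.array([0.1, 0.4, 0.8])  # from the neural acoustic model
p_llm = np.array([0.2, 0.5, 0.9])       # from the LLM over the transcript
print(fuse_turn_probs(p_acoustic, p_llm))  # [0.14 0.44 0.84]
```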
Automated speaker identification (SID) is a crucial step for the personalization of a wide range of speech-enabled services. Typical SID systems use a symmetric enrollment-verification framework with a single model to derive embeddings both offline…
External link:
http://arxiv.org/abs/2401.12440
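A minimal sketch of the symmetric enrollment-verification framework the abstract describes: the same model embeds enrollment audio offline and test audio at runtime, and a cosine-similarity threshold accepts or rejects the speaker match. The vectors and threshold are illustrative stand-ins.

```python
import numpy as np

def verify_speaker(enroll_emb: np.ndarray, test_emb: np.ndarray,
                   threshold: float = 0.7) -> bool:
    """Accept the speaker if embeddings are similar enough."""
    cos = np.dot(enroll_emb, test_emb) / (
        np.linalg.norm(enroll_emb) * np.linalg.norm(test_emb))
    return cos >= threshold

enroll = np.random.randn(256)               # profile computed at enrollment
test = enroll + 0.1 * np.random.randn(256)  # runtime utterance, same speaker
print(verify_speaker(enroll, test))         # True (embeddings nearly aligned)
```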
Authors:
Yu, Yu, Yang, Chao-Han Huck, Dinh, Tuan, Ryu, Sungho, Kolehmainen, Jari, Ren, Roger, Filimonov, Denis, Shivakumar, Prashanth G., Gandhe, Ankur, Rastrow, Ariya, Xu, Jia, Bulyko, Ivan, Stolcke, Andreas
The use of low-rank adaptation (LoRA) with frozen pretrained language models (PLMs) has become increasingly popular as a mainstream, resource-efficient modeling approach for memory-constrained hardware. In this study, we first explore how to enhance model…
External link:
http://arxiv.org/abs/2401.10447
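For readers unfamiliar with the LoRA technique referenced in the abstract: the pretrained weight W stays frozen and only a low-rank update BA is learned, so just r * (d_in + d_out) parameters train per adapted layer. Dimensions and rank below are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank (LoRA) update."""

    def __init__(self, d_in: int = 768, d_out: int = 768,
                 r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, r))        # up-projection
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + (alpha / r) * B A x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear()
x = torch.randn(4, 768)
print(layer(x).shape)  # torch.Size([4, 768])
```

Initializing B to zero makes the update a no-op at the start of training, so the adapted model begins exactly at the pretrained model's behavior.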
Authors:
Everson, Kevin, Gu, Yile, Yang, Huck, Shivakumar, Prashanth Gurunath, Lin, Guan-Ting, Kolehmainen, Jari, Bulyko, Ivan, Gandhe, Ankur, Ghosh, Shalini, Hamza, Wael, Lee, Hung-yi, Rastrow, Ariya, Stolcke, Andreas
In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In real-world scenarios…
External link:
http://arxiv.org/abs/2401.02921
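A hedged sketch of the SLU-via-LLM setup the abstract describes: the LLM receives a (possibly errorful) ASR transcript rather than written text and is prompted for an NLU label such as intent. The prompt wording and intent labels are illustrative assumptions.

```python
INTENTS = ["play_music", "set_alarm", "get_weather"]  # hypothetical label set

def build_intent_prompt(asr_transcript: str) -> str:
    """Prompt an LLM to classify the intent of a transcribed utterance."""
    return (
        f'Transcribed user utterance: "{asr_transcript}"\n'
        f"Choose the intent from {INTENTS}.\nIntent:"
    )

print(build_intent_prompt("wake me up at seven a m"))
# Sent to an LLM, this prompt should elicit "set_alarm".
```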
Authors:
Lin, Guan-Ting, Shivakumar, Prashanth Gurunath, Gandhe, Ankur, Yang, Chao-Han Huck, Gu, Yile, Ghosh, Shalini, Stolcke, Andreas, Lee, Hung-yi, Bulyko, Ivan
Large Language Models (LLMs) have demonstrated superior abilities in tasks such as chatting, reasoning, and question answering. However, standard LLMs may ignore crucial paralinguistic information, such as sentiment, emotion, and speaking style, which…
External link:
http://arxiv.org/abs/2312.15316