Showing 1 - 10 of 929 for search: '"P. Paturi"'
Published in:
AIP Advances, Vol 14, Iss 4, Pp 045309-045309-9 (2024)
Gd0.2Ca0.8MnO3 thin films were deposited on various substrate materials and their structural and resistive switching (RS) properties were investigated. The deposition resulted in epitaxial and polycrystalline films, with the latter also exhibiting di
External link:
https://doaj.org/article/b493a65a2c48481d962edd06627ff8c5
End-to-end neural diarization (EEND) models offer significant improvements over traditional embedding-based Speaker Diarization (SD) approaches but fall short in generalizing to long-form audio with a large number of speakers. EEND-vector-clustering m
External link:
http://arxiv.org/abs/2406.18679
Speaker Diarization (SD) systems are typically audio-based and operate independently of the ASR system in traditional speech transcription pipelines and can have speaker errors due to SD and/or ASR reconciliation, especially around speaker turns and
External link:
http://arxiv.org/abs/2406.17266
Large language models (LLMs) often improve their performance in downstream tasks when they generate Chain of Thought reasoning text before producing an answer. We investigate how LLMs recover from errors in Chain of Thought. Through analysis of error
External link:
http://arxiv.org/abs/2405.15092
Author:
Das, Nilaksh, Dingliwal, Saket, Ronanki, Srikanth, Paturi, Rohit, Huang, Zhaocheng, Mathur, Prashant, Yuan, Jie, Bekal, Dhanush, Niu, Xing, Jayanthi, Sai Muralidhar, Li, Xilai, Mundnich, Karel, Sunkara, Monica, Srinivasan, Sundararajan, Han, Kyu J, Kirchhoff, Katrin
Large language models (LLMs) have shown incredible proficiency in performing tasks that require semantic understanding of natural language instructions. Recently, many works have further expanded this capability to perceive multimodal audio and text
External link:
http://arxiv.org/abs/2405.08295
Depth-3 circuit lower bounds and $k$-SAT algorithms are intimately related; the state-of-the-art $\Sigma^k_3$-circuit lower bound and the $k$-SAT algorithm are based on the same combinatorial theorem. In this paper we define a problem which reveals n
External link:
http://arxiv.org/abs/2403.09134
Effective information retrieval (IR) in settings with limited training data, particularly for complex queries, remains a challenging task. This paper introduces IR2, Information Regularization for Information Retrieval, a technique for reducing overf
External link:
http://arxiv.org/abs/2402.16200
We present the Benchmark of Information Retrieval (IR) tasks with Complex Objectives (BIRCO). BIRCO evaluates the ability of IR systems to retrieve documents given multi-faceted user objectives. The benchmark's complexity and compact size make it sui
External link:
http://arxiv.org/abs/2402.14151
Author:
Elluru, Veera Raghavendra, Kulshreshtha, Devang, Paturi, Rohit, Bodapati, Sravan, Ronanki, Srikanth
Spoken language understanding systems using audio-only data are gaining popularity, yet their ability to handle unseen intents remains limited. In this study, we propose a generalized zero-shot audio-to-intent classification framework with only a few
External link:
http://arxiv.org/abs/2311.02482
Author:
Zuluaga-Gomez, Juan, Huang, Zhaocheng, Niu, Xing, Paturi, Rohit, Srinivasan, Sundararajan, Mathur, Prashant, Thompson, Brian, Federico, Marcello
Conventional speech-to-text translation (ST) systems are trained on single-speaker utterances, and they may not generalize to real-life scenarios where the audio contains conversations by multiple speakers. In this paper, we tackle single-channel mul
External link:
http://arxiv.org/abs/2311.00697