Showing 1 - 10 of 3,410 results for search: '"Chan David"'
The Automated Audio Captioning (AAC) task asks models to generate natural language descriptions of an audio input. Evaluating these machine-generated audio captions is a complex task that requires considering diverse factors, among them, auditory sce
External link:
http://arxiv.org/abs/2409.12962
Author:
Tulsiani, Hitesh, Chan, David M., Ghosh, Shalini, Lalwani, Garima, Pandey, Prabhat, Bansal, Ankish, Garimella, Sri, Rastrow, Ariya, Hoffmeister, Björn
Dialog systems, such as voice assistants, are expected to engage with users in complex, evolving conversations. Unfortunately, traditional automatic speech recognition (ASR) systems deployed in such applications are usually trained to recognize each
External link:
http://arxiv.org/abs/2409.10515
Assessing personality traits using large language models (LLMs) has emerged as an interesting and challenging area of research. While previous methods employ explicit questionnaires, often derived from the Big Five model of personality, we hypothesiz
External link:
http://arxiv.org/abs/2409.09905
Author:
Wu, Tsung-Han, Biamby, Giscard, Quenum, Jerome, Gupta, Ritwik, Gonzalez, Joseph E., Darrell, Trevor, Chan, David M.
Large Multimodal Models (LMMs) have made significant strides in visual question-answering for single images. Recent advancements like long-context LMMs have allowed them to ingest larger, or even multiple, images. However, the ability to process a la
External link:
http://arxiv.org/abs/2407.13766
Author:
Moon, Suhong, Abdulhai, Marwa, Kang, Minwoo, Suh, Joseph, Soedarmadji, Widyadewi, Behar, Eran Kohen, Chan, David M.
Large language models (LLMs) are trained from vast repositories of text authored by millions of distinct authors, reflecting an enormous diversity of human traits. While these models bear the potential to be used as approximations of human subjects i
External link:
http://arxiv.org/abs/2407.06576
Author:
Chan, David, Vogeli, Chase
We compute the $RO(G)$-graded equivariant algebraic $K$-groups of a finite field with an action by its Galois group $G$. Specifically, we show these $K$-groups split as the sum of an explicitly computable term and the well-studied $RO(G)$-graded coef
External link:
http://arxiv.org/abs/2406.19481
Published in:
EPJ Web of Conferences, Vol 287, p 01008 (2023)
We review our recent progress on advanced silicon photonic devices and photonic circuits, including advanced grating couplers, modulators, mode and polarization division multiplexing and integrated optical signal processors for use in high capacity d
External link:
https://doaj.org/article/aefffa8c793a4fb493f464a1a74666e0
Author:
Petryk, Suzanne, Chan, David M., Kachinthaya, Anish, Zou, Haodi, Canny, John, Gonzalez, Joseph E., Darrell, Trevor
Despite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The existing prominent metric for object hallucination,
External link:
http://arxiv.org/abs/2404.02904
Author:
Jain, Yash, Chan, David, Dheram, Pranav, Khare, Aparna, Shonibare, Olabanji, Ravichandran, Venkatesh, Ghosh, Shalini
Recent advances in machine learning have demonstrated that multi-modal pre-training can improve automatic speech recognition (ASR) performance compared to randomly initialized models, even when models are fine-tuned on uni-modal tasks. Existing multi
External link:
http://arxiv.org/abs/2403.19822
Author:
Chen, Jian, Chen, Mingcong, Zhao, Qingxiang, Wang, Shuai, Wang, Yihe, Xiao, Ying, Hu, Jian, Chan, Danny Tat Ming, Yeung, Kam Tong Leo, Chan, David Yuen Chung, Liu, Hongbin
Published in:
IEEE International Conference on Robotics & Automation, 2024
Traditional rigid endoscopes have challenges in flexibly treating tumors located deep in the brain, and low operability and fixed viewing angles limit their development. This study introduces a novel dual-segment flexible robotic endoscope MicroNeuro,
External link:
http://arxiv.org/abs/2402.09679