Showing 1 - 10 of 10,658 results for search: '"Chan-In M"'
The hypergraph Zarankiewicz's problem, introduced by Erdős in 1964, asks for the maximum number of hyperedges in an $r$-partite hypergraph with $n$ vertices in each part that does not contain a copy of $K_{t,t,\ldots,t}$. Erdős obtained a nea
External link:
http://arxiv.org/abs/2412.06490
Authors:
Chan, David M., Corona, Rodolfo, Park, Joonyong, Cho, Cheol Jun, Bai, Yutong, Darrell, Trevor
With the introduction of transformer-based models for vision and language tasks, such as LLaVA and Chameleon, there has been renewed interest in the discrete tokenized representation of images. These models often treat image patches as discrete token
External link:
http://arxiv.org/abs/2411.05001
The Automated Audio Captioning (AAC) task asks models to generate natural language descriptions of an audio input. Evaluating these machine-generated audio captions is a complex task that requires considering diverse factors, among them, auditory sce
External link:
http://arxiv.org/abs/2409.12962
Authors:
Tulsiani, Hitesh, Chan, David M., Ghosh, Shalini, Lalwani, Garima, Pandey, Prabhat, Bansal, Ankish, Garimella, Sri, Rastrow, Ariya, Hoffmeister, Björn
Dialog systems, such as voice assistants, are expected to engage with users in complex, evolving conversations. Unfortunately, traditional automatic speech recognition (ASR) systems deployed in such applications are usually trained to recognize each
External link:
http://arxiv.org/abs/2409.10515
Assessing personality traits using large language models (LLMs) has emerged as an interesting and challenging area of research. While previous methods employ explicit questionnaires, often derived from the Big Five model of personality, we hypothesiz
External link:
http://arxiv.org/abs/2409.09905
Authors:
Bhore, Sujoy, Chan, Timothy M.
We develop simple and general techniques to obtain faster (near-linear time) static approximation algorithms, as well as efficient dynamic data structures, for four fundamental geometric optimization problems: minimum piercing set (MPS), maximum inde
External link:
http://arxiv.org/abs/2407.20659
Authors:
Wu, Tsung-Han, Biamby, Giscard, Quenum, Jerome, Gupta, Ritwik, Gonzalez, Joseph E., Darrell, Trevor, Chan, David M.
Large Multimodal Models (LMMs) have made significant strides in visual question-answering for single images. Recent advancements like long-context LMMs have allowed them to ingest larger, or even multiple, images. However, the ability to process a la
External link:
http://arxiv.org/abs/2407.13766
Authors:
Moon, Suhong, Abdulhai, Marwa, Kang, Minwoo, Suh, Joseph, Soedarmadji, Widyadewi, Behar, Eran Kohen, Chan, David M.
Large language models (LLMs) are trained from vast repositories of text authored by millions of distinct authors, reflecting an enormous diversity of human traits. While these models bear the potential to be used as approximations of human subjects i
External link:
http://arxiv.org/abs/2407.06576
Authors:
Petryk, Suzanne, Chan, David M., Kachinthaya, Anish, Zou, Haodi, Canny, John, Gonzalez, Joseph E., Darrell, Trevor
Despite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The existing prominent metric for object hallucination,
External link:
http://arxiv.org/abs/2404.02904
Authors:
Fardian-Melamed, Natalie, Skripka, Artiom, Lee, Changhwan, Ursprung, Benedikt, Darlington, Thomas P., Teitelboim, Ayelet, Qi, Xiao, Wang, Maoji, Gerton, Jordan M., Cohen, Bruce E., Chan, Emory M., Schuck, P. James
Mechanical force is an essential feature for many physical and biological processes [1-12]. Remote measurement of mechanical signals with high sensitivity and spatial resolution is needed for diverse applications, including robotics [13], biophysics [14-20]
External link:
http://arxiv.org/abs/2404.02026