Showing 1 - 7 of 7 for search: '"Biamby, Giscard"'
Authors:
Wu, Tsung-Han, Biamby, Giscard, Quenum, Jerome, Gupta, Ritwik, Gonzalez, Joseph E., Darrell, Trevor, Chan, David M.
Large Multimodal Models (LMMs) have made significant strides in visual question-answering for single images. Recent advancements like long-context LMMs have allowed them to ingest larger, or even multiple, images. However, the ability to process a la
External link:
http://arxiv.org/abs/2407.13766
Authors:
Niu, Dantong, Sharma, Yuvan, Biamby, Giscard, Quenum, Jerome, Bai, Yutong, Shi, Baifeng, Darrell, Trevor, Herzig, Roei
In recent years, instruction-tuned Large Multimodal Models (LMMs) have been successful at several tasks, including image captioning and visual question answering; yet leveraging these models remains an open question for robotics. Prior LMMs for robot
External link:
http://arxiv.org/abs/2406.11815
Authors:
Wu, Tsung-Han, Biamby, Giscard, Chan, David, Dunlap, Lisa, Gupta, Ritwik, Wang, Xudong, Gonzalez, Joseph E., Darrell, Trevor
Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image.
External link:
http://arxiv.org/abs/2312.08366
We demonstrate how language can improve geolocation: the task of predicting the location where an image was taken. Here we study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features human
External link:
http://arxiv.org/abs/2211.15521
Detecting out-of-context media, such as "mis-captioned" images on Twitter, is a relevant problem, especially in domains of high public significance. In this work we aim to develop defenses against such misinformation for the topics of Climate Change,
External link:
http://arxiv.org/abs/2112.08594
Authors:
Laielli, Michael, Biamby, Giscard, Chen, Dian, Gupta, Ritwik, Loeffler, Adam, Nguyen, Phat Dat, Luo, Ross, Darrell, Trevor, Ebrahimi, Sayna
Active learning for object detection is conventionally achieved by applying techniques developed for classification in a way that aggregates individual detections into image-level selection criteria. This is typically coupled with the costly assumpti
External link:
http://arxiv.org/abs/2108.09186
Authors:
Ebrahimi, Sayna, Gan, William, Chen, Dian, Biamby, Giscard, Salahi, Kamyar, Laielli, Michael, Zhu, Shizhan, Darrell, Trevor
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator. Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use
External link:
http://arxiv.org/abs/2012.10467