Search results - "Plummer, Bryan A."

Report

Tell Me What's Next: Textual Foresight for Generic UI Representations

Authors: Burns, Andrea; Saenko, Kate; Plummer, Bryan A.

Mobile app user interfaces (UIs) are rich with action, text, structure, and image content that can be utilized to learn generic UI representations for tasks like automating user commands, summarizing content, and evaluating the accessibility of user

External link: http://arxiv.org/abs/2406.07822

Report

SLANT: Spurious Logo ANalysis Toolkit

Authors: Qraitem, Maan; Teterwak, Piotr; Saenko, Kate; Plummer, Bryan A.

Online content is filled with logos, from ads and social media posts to website branding and product placements. Consequently, these logos are prevalent in the extensive web-scraped datasets used to pretrain Vision-Language Models, which are used for

External link: http://arxiv.org/abs/2406.01449

Report

Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers

Authors: Pham, Chau; Plummer, Bryan A.

Multi-Channel Imaging (MCI) contains an array of challenges for encoding useful feature representations not present in traditional images. For example, images from two different satellites may both contain RGB channels, but the remaining channels can

External link: http://arxiv.org/abs/2405.16419

Report

Koala: Key frame-conditioned long video-LLM

Authors: Tan, Reuben; Sun, Ximeng; Hu, Ping; Wang, Jui-hsien; Deilamsalehy, Hanieh; Plummer, Bryan A.; Russell, Bryan; Saenko, Kate

Long video question answering is a challenging task that involves recognizing short-term activities and reasoning about their fine-grained relationships. State-of-the-art video Large Language Models (vLLMs) hold promise as a viable solution due to th

External link: http://arxiv.org/abs/2404.04346

Report

Machine-Generated Text Localization

Authors: Zhang, Zhongping; Qin, Wenda; Plummer, Bryan A.

Machine-Generated Text (MGT) detection aims to identify a piece of text as machine or human written. Prior work has primarily formulated MGT detection as a binary classification task over an entire document, with limited work exploring cases where on

External link: http://arxiv.org/abs/2402.11744

Report

Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks

Authors: Qraitem, Maan; Tasnim, Nazia; Teterwak, Piotr; Saenko, Kate; Plummer, Bryan A.

Typographic Attacks, which involve pasting misleading text onto an image, were noted to harm the performance of Vision-Language Models like CLIP. However, the susceptibility of recent Large Vision-Language Models to these attacks remains understudied

External link: http://arxiv.org/abs/2402.00626

Report

UniHuman: A Unified Model for Editing Human Images in the Wild

Authors: Li, Nannan; Liu, Qing; Singh, Krishna Kumar; Wang, Yilin; Zhang, Jianming; Plummer, Bryan A.; Lin, Zhe

Human image editing includes tasks like changing a person's pose, their clothing, or editing the image according to a text prompt. However, prior work often tackles these tasks separately, overlooking the benefit of mutual reinforcement from learning

External link: http://arxiv.org/abs/2312.14985

Report

CLAMP: Contrastive LAnguage Model Prompt-tuning

Authors: Teterwak, Piotr; Sun, Ximeng; Plummer, Bryan A.; Saenko, Kate; Lim, Ser-Nam

Large language models (LLMs) have emerged as powerful general-purpose interfaces for many machine learning problems. Recent work has adapted LLMs to generative visual tasks like image captioning, visual question answering, and visual chat, using a re

External link: http://arxiv.org/abs/2312.01629

Report

Learning to Compose SuperWeights for Neural Parameter Allocation Search

Authors: Teterwak, Piotr; Nelson, Soren; Dryden, Nikoli; Bashkirova, Dina; Saenko, Kate; Plummer, Bryan A.

Neural parameter allocation search (NPAS) automates parameter sharing by obtaining weights for a network given an arbitrary, fixed parameter budget. Prior work has two major drawbacks we aim to address. First, there is a disconnect in the sharing pat

External link: http://arxiv.org/abs/2312.01274

Report

A Unified Framework for Connecting Noise Modeling to Boost Noise Detection

Authors: Wang, Siqi; Pham, Chau; Plummer, Bryan A.

Noisy labels can impair model performance, making the study of learning with noisy labels an important topic. Two conventional approaches are noise modeling and noise detection. However, these two methods are typically studied independently, and ther

External link: http://arxiv.org/abs/2312.00827
