Showing 1 - 10 of 257 for search: '"Wang, William"'
Author:
Wu, Qiucheng, Zhao, Handong, Saxon, Michael, Bui, Trung, Wang, William Yang, Zhang, Yang, Chang, Shiyu
Vision language models (VLMs) are an exciting emerging class of language models (LMs) that have merged classic LM capabilities with those of image processing systems. However, the ways that these capabilities combine are not always intuitive and warr…
External link:
http://arxiv.org/abs/2407.01863
We present LoCoVQA, a dynamic benchmark generator for evaluating long-context extractive reasoning in vision language models (VLMs). LoCoVQA augments test examples for mathematical reasoning, VQA, and character recognition tasks with increasingly lon…
External link:
http://arxiv.org/abs/2406.16851
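The LoCoVQA entry above describes padding test examples with increasingly long visual contexts. A minimal sketch of that augmentation idea in Python, with hypothetical helper names and data layout (the actual generator is the paper's contribution):

import random

def build_long_context_example(target_image, question, answer,
                               distractor_pool, context_len):
    # Hypothetical sketch: hide the one relevant image among
    # context_len - 1 distractors, so the model must first locate
    # the evidence and then answer the question.
    distractors = random.sample(distractor_pool, context_len - 1)
    images = distractors + [target_image]
    random.shuffle(images)  # position of the evidence is randomized
    return {"images": images, "question": question, "answer": answer}

Sweeping context_len (e.g. 1, 2, 4, 8, ...) over the same base question yields progressively harder long-context extractive-reasoning variants.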
Large language models (LLMs) have shown remarkable performance on code generation tasks. A recent application of LLMs for code generation is iterative code repair, where a model fixes an incorrect program by rationalizing about errors and generating…
External link:
http://arxiv.org/abs/2406.14867
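The entry above describes iterative code repair, where an LLM reasons about test failures and regenerates the program. A rough sketch of such a repair loop, with query_llm as a placeholder for any chat-completion call and a toy test harness that expects the candidate program to define solve():

def query_llm(prompt):
    # Placeholder for a real LLM call; substitute any completion API.
    raise NotImplementedError

def run_tests(program, tests):
    # Execute the candidate against (input, expected) pairs and collect
    # human-readable failure messages to feed back to the model.
    failures = []
    for inp, expected in tests:
        scope = {}
        try:
            exec(program, scope)
            got = scope["solve"](inp)
            if got != expected:
                failures.append(f"solve({inp!r}) -> {got!r}, expected {expected!r}")
        except Exception as exc:
            failures.append(f"solve({inp!r}) raised {exc!r}")
    return failures

def iterative_repair(program, tests, max_rounds=3):
    # Alternate between testing and asking the model to explain and fix
    # the observed errors, up to a fixed number of rounds.
    for _ in range(max_rounds):
        failures = run_tests(program, tests)
        if not failures:
            return program
        prompt = ("This program fails some tests.\n"
                  f"Program:\n{program}\n"
                  "Failures:\n" + "\n".join(failures) +
                  "\nExplain the bug, then output a corrected program.")
        program = query_llm(prompt)
    return program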
Author:
Amayuelas, Alfonso, Yang, Xianjun, Antoniades, Antonis, Hua, Wenyue, Pan, Liangming, Wang, William
Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models…
External link:
http://arxiv.org/abs/2406.14711
Author:
Wang, Danqing, Antoniades, Antonis, Luong, Kha-Dinh, Zhang, Edwin, Kosan, Mert, Li, Jiachen, Singh, Ambuj, Wang, William Yang, Li, Lei
Counterfactual explanations of Graph Neural Networks (GNNs) offer a powerful way to understand data that can naturally be represented by a graph structure. Furthermore, in many domains, it is highly desirable to derive data-driven global explanations…
External link:
http://arxiv.org/abs/2406.13869
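The entry above concerns counterfactual explanations for GNNs (and, in the paper, global ones). As background, an instance-level counterfactual is a small edit to the input graph that flips the model's prediction; a naive greedy sketch with a placeholder classifier:

import networkx as nx

def predict(graph):
    # Placeholder for a trained GNN graph classifier.
    raise NotImplementedError

def greedy_counterfactual(graph, max_edits=5):
    # Naive baseline: try single-edge removals; if none flips the
    # prediction, commit one removal and repeat, up to max_edits edits.
    original = predict(graph)
    g = graph.copy()
    for _ in range(max_edits):
        for u, v in list(g.edges()):
            trial = g.copy()
            trial.remove_edge(u, v)
            if predict(trial) != original:
                return trial  # counterfactual found
        if g.number_of_edges() == 0:
            break
        g.remove_edge(*next(iter(g.edges())))  # commit one edit, continue
    return None  # no counterfactual within the edit budget

A global explanation, as the paper targets, would summarize such counterfactual structure across many instances rather than per graph.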
Direct alignment from preferences (DAP) has emerged as a promising paradigm for aligning large language models (LLMs) to human desiderata from pre-collected, offline preference datasets. While recent studies indicate that existing offline DAP methods…
External link:
http://arxiv.org/abs/2406.12168
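The entry above concerns direct alignment from preferences (DAP), of which DPO is a canonical offline instance. A minimal PyTorch sketch of the standard DPO objective on precomputed per-response log-probabilities (summed over response tokens):

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit rewards are the beta-scaled log-ratios between the policy
    # and the frozen reference model; the loss pushes the chosen response's
    # reward above the rejected one's.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with made-up log-probabilities for one preference pair:
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss.item())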
Recent breakthroughs in vision-language models (VLMs) emphasize the necessity of benchmarking human preferences in real-world multimodal interactions. To address this gap, we launched WildVision-Arena (WV-Arena), an online platform that collects huma…
External link:
http://arxiv.org/abs/2406.11069
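The entry above describes WildVision-Arena, which crowdsources pairwise human preference votes between VLMs. Arena-style leaderboards commonly aggregate such votes with Elo-style ratings; a minimal sketch (the actual WV-Arena aggregation may differ):

def elo_update(rating_a, rating_b, a_wins, k=32.0):
    # One rating update from a single pairwise human vote.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    rating_a += k * (score_a - expected_a)
    rating_b += k * ((1.0 - score_a) - (1.0 - expected_a))
    return rating_a, rating_b

# Example: model A beats model B once, starting from equal ratings.
print(elo_update(1000.0, 1000.0, a_wins=True))  # (1016.0, 984.0)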
Video generation has many unique challenges beyond those of image generation. The temporal dimension introduces extensive possible variations across frames, over which consistency and continuity may be violated. In this study, we move beyond evaluati…
External link:
http://arxiv.org/abs/2406.08656
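The entry above concerns evaluating video generation beyond per-frame quality, in particular consistency along the temporal dimension. One crude illustrative proxy (not the paper's method) is the mean similarity between embeddings of consecutive frames:

import numpy as np

def temporal_consistency(frame_embeddings):
    # frame_embeddings: array of shape (num_frames, dim).
    # Returns mean cosine similarity of adjacent frames; higher values
    # suggest smoother, more consistent clips under this simple proxy.
    e = frame_embeddings / np.linalg.norm(frame_embeddings, axis=1, keepdims=True)
    sims = np.sum(e[:-1] * e[1:], axis=1)
    return float(sims.mean())

# Toy usage with random embeddings for an 8-frame clip:
print(temporal_consistency(np.random.randn(8, 512)))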
Author:
He, Xuehai, Feng, Weixi, Zheng, Kaizhi, Lu, Yujie, Zhu, Wanrong, Li, Jiachen, Fan, Yue, Wang, Jianfeng, Li, Linjie, Yang, Zhengyuan, Lin, Kevin, Wang, William Yang, Wang, Lijuan, Wang, Xin Eric
Multimodal Large Language Models (MLLMs) demonstrate the emerging abilities of "world models" -- interpreting and reasoning about complex real-world dynamics. To assess these abilities, we posit videos are the ideal medium, as they encapsulate ric…
External link:
http://arxiv.org/abs/2406.08407
We present a novel task and benchmark for evaluating the ability of text-to-image (T2I) generation models to produce images that fit commonsense in real life, which we call Commonsense-T2I. Given two adversarial text prompts containing an identical se…
External link:
http://arxiv.org/abs/2406.07546
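The final entry describes Commonsense-T2I, which pairs adversarial prompts and checks whether a text-to-image model's outputs fit real-life commonsense. A hypothetical sketch of one way such a pairwise harness could score a prompt pair, with placeholder generation and judging functions:

def generate_image(prompt):
    # Placeholder for a text-to-image model call.
    raise NotImplementedError

def matches_expectation(image, expected_description):
    # Placeholder judge (e.g. a VQA model or a human rater) that checks
    # whether the image shows the commonsense-expected outcome.
    raise NotImplementedError

def score_pair(prompt_a, expected_a, prompt_b, expected_b):
    # One possible scoring rule: credit a pair only if both adversarial
    # prompts are rendered correctly, so one-sided guessing earns nothing.
    ok_a = matches_expectation(generate_image(prompt_a), expected_a)
    ok_b = matches_expectation(generate_image(prompt_b), expected_b)
    return 1.0 if (ok_a and ok_b) else 0.0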