Search results

Report

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Authors: Lu, Yujie; Jiang, Dongfu; Chen, Wenhu; Wang, William Yang; Choi, Yejin; Lin, Bill Yuchen

Recent breakthroughs in vision-language models (VLMs) emphasize the necessity of benchmarking human preferences in real-world multimodal interactions. To address this gap, we launched WildVision-Arena (WV-Arena), an online platform that collects huma

External link: http://arxiv.org/abs/2406.11069

Report

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

Authors: He, Xuehai; Feng, Weixi; Zheng, Kaizhi; Lu, Yujie; Zhu, Wanrong; Li, Jiachen; Fan, Yue; Wang, Jianfeng; Li, Linjie; Yang, Zhengyuan; Lin, Kevin; Wang, William Yang; Wang, Lijuan; Wang, Xin Eric

Multimodal Large Language Models (MLLMs) demonstrate the emerging abilities of "world models" -- interpreting and reasoning about complex real-world dynamics. To assess these abilities, we posit videos are the ideal medium, as they encapsulate ric

External link: http://arxiv.org/abs/2406.08407

Report

Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?

Authors: Fu, Xingyu; He, Muyu; Lu, Yujie; Wang, William Yang; Roth, Dan

We present a novel task and benchmark for evaluating the ability of text-to-image (T2I) generation models to produce images that align with commonsense in real life, which we call Commonsense-T2I. Given two adversarial text prompts containing an ident

External link: http://arxiv.org/abs/2406.07546

Report

From Text to Pixel: Advancing Long-Context Understanding in MLLMs

Authors: Lu, Yujie; Li, Xiujun; Fu, Tsu-Jui; Eckstein, Miguel; Wang, William Yang

The rapid progress in Multimodal Large Language Models (MLLMs) has significantly advanced their ability to process and understand complex visual and textual information. However, the integration of multiple images and extensive textual contexts remai

External link: http://arxiv.org/abs/2405.14213

Academic article

Remote monitoring of stored grain insect pests

Authors: Wang, Dianxuan; Bai, Chunqi; Li, Hui; Lu, Yujie; Guo, Xu

Published in: Julius-Kühn-Archiv, Vol 463, Iss 1, Pp 239-245 (2018)

A number of remote sensing methods were developed and tested in commercial grain warehouses: probe pitfall traps attached to vacuum lines, surface pitfall traps equipped with video cameras, and white boards on grain surface monitored with video camer

External link: https://doaj.org/article/5ca7828ba0f84360b237cf6468b03bc2

Report

Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)

Authors: Saxon, Michael; Jahara, Fatima; Khoshnoodi, Mahsa; Lu, Yujie; Sharma, Aditya; Wang, William Yang

With advances in the quality of text-to-image (T2I) models has come interest in benchmarking their prompt faithfulness: the semantic coherence of generated images to the prompts they were conditioned on. A variety of T2I faithfulness metrics have been

External link: http://arxiv.org/abs/2404.04251

Report

Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes

Authors: Lu, Yujie; Wan, Long; Ding, Nayu; Wang, Yulong; Shen, Shuhan; Cai, Shen; Gao, Lin

Neural implicit representation of geometric shapes has witnessed considerable advancements in recent years. However, common distance field based implicit representations, specifically signed distance field (SDF) for watertight shapes or unsigned dist

External link: http://arxiv.org/abs/2403.01414

Academic article

Rapid detection of phosphine resistance in the lesser grain borer, Rhyzopertha dominica (Coleoptera: Bostrychidae) from China using ARMS-PCR

Authors: Lu, Yujie; Zhang, Chenguang; Wang, Zhenyan; Yan, Xiaoping; Emery, Robert N.

Published in: Julius-Kühn-Archiv, Vol 463, Iss 2, Pp 1043-1045 (2018)

The lesser grain borer, Rhyzopertha dominica, is one of the most serious cosmopolitan stored-grain pests. Highly phosphine-resistant R. dominica has been reported in several countries. The evolution of strong phosphine resistance is a major challen

External link: https://doaj.org/article/aadc5f99034947849778eb6852d51f61

Report

Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels?

Authors: Li, Xiujun; Lu, Yujie; Gan, Zhe; Gao, Jianfeng; Wang, William Yang; Choi, Yejin

Recent multimodal large language models (MLLMs) have shown promising instruction following capabilities on vision-language tasks. In this work, we introduce VISUAL MODALITY INSTRUCTION (VIM), and investigate how well multimodal models can understand

External link: http://arxiv.org/abs/2311.17647

Report

GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

Authors: Zhang, Xinlu; Lu, Yujie; Wang, Weizhi; Yan, An; Yan, Jun; Qin, Lianke; Wang, Heng; Yan, Xifeng; Wang, William Yang; Petzold, Linda Ruth

Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details. Although GPT-4V has shown promising results in various multi-modal tasks,

External link: http://arxiv.org/abs/2311.01361
