Showing 1 - 10
of 301
for the search: '"Yan Yibo"'
Ensuring that Multimodal Large Language Models (MLLMs) maintain consistency in their responses is essential for developing trustworthy multimodal intelligence. However, existing benchmarks include many samples where all MLLMs exhibit high res
External link:
http://arxiv.org/abs/2411.02708
In recent years, multimodal large language models (MLLMs) have significantly advanced, integrating more modalities into diverse applications. However, the lack of explainability remains a major barrier to their use in scenarios requiring decision tra
External link:
http://arxiv.org/abs/2410.04819
Multimodal Large Language Models (MLLMs) have emerged as a central focus in both industry and academia, but often suffer from biases introduced by visual and language priors, which can lead to multimodal hallucination. These biases arise from the vis
External link:
http://arxiv.org/abs/2410.04780
Author:
Yan, Yibo, Wang, Shen, Huo, Jiahao, Li, Hang, Li, Boyan, Su, Jiamin, Gao, Xiong, Zhang, Yi-Fan, Xu, Tianlong, Chu, Zhendong, Zhong, Aoxiao, Wang, Kun, Xiong, Hui, Yu, Philip S., Hu, Xuming, Wen, Qingsong
As the field of Multimodal Large Language Models (MLLMs) continues to evolve, their potential to revolutionize artificial intelligence is particularly promising, especially in addressing mathematical reasoning tasks. Current mathematical benchmarks p
External link:
http://arxiv.org/abs/2410.04509
Author:
Zou, Xin, Wang, Yizhou, Yan, Yibo, Huang, Sirui, Zheng, Kening, Chen, Junkai, Tang, Chang, Hu, Xuming
Despite their impressive capabilities, Multimodal Large Language Models (MLLMs) are susceptible to hallucinations, especially assertively fabricating content not present in the visual inputs. To address the aforementioned challenge, we follow a commo
External link:
http://arxiv.org/abs/2410.03577
In human reading and communication, individuals tend to engage in geospatial reasoning, which involves recognizing geographic entities and making informed inferences about their interrelationships. To mimic such a cognitive process, current methods eit
External link:
http://arxiv.org/abs/2408.11366
Hallucination issues have persistently plagued current multimodal large language models (MLLMs). While existing research primarily focuses on object-level or attribute-level hallucinations, sidelining the more sophisticated relation hallucinations that ne
External link:
http://arxiv.org/abs/2408.09429
Published in:
Ecological Indicators, Vol 133, Pp 108380 (2021)
In the present study, the urban agglomeration on the northern slope of Tianshan Mountain (UANST) in Xinjiang was taken as the research object. Principal component analysis, coefficient of variation and analytic hierarchy process were used to analyse
External link:
https://doaj.org/article/2dfa941e037c44fa97352b4b1c13dbaa
Author:
Zhu, Junyi, Liu, Shuochen, Yu, Yu, Tang, Bo, Yan, Yibo, Li, Zhiyu, Xiong, Feiyu, Xu, Tong, Blaschko, Matthew B.
Large language models (LLMs) excel in generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method designed to en
External link:
http://arxiv.org/abs/2406.16069
MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model
Projecting visual features into word embedding space has become a significant fusion strategy adopted by Multimodal Large Language Models (MLLMs). However, its internal mechanisms have yet to be explored. Inspired by multilingual research, we identif
External link:
http://arxiv.org/abs/2406.11193