Search results

Report

Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence

Authors: He, Jinghan, Zhu, Kuan, Guo, Haiyun, Fang, Junfeng, Hua, Zhenglin, Jia, Yuheng, Tang, Ming, Chua, Tat-Seng, Wang, Jinqiao

Large vision-language models (LVLMs) have made substantial progress in integrating large language models (LLMs) with visual inputs, enabling advanced multimodal reasoning. Despite their success, a persistent challenge is hallucination, where generated

External link: http://arxiv.org/abs/2412.13949

Report

Monocular Lane Detection Based on Deep Learning: A Survey

Authors: He, Xin, Guo, Haiyun, Zhu, Kuan, Zhu, Bingke, Zhao, Xu, Fang, Jianwu, Wang, Jinqiao

Lane detection plays an important role in autonomous driving perception systems. As deep learning algorithms gain popularity, monocular lane detection methods based on them have demonstrated superior performance and emerged as a key research directio

External link: http://arxiv.org/abs/2411.16316

Report

Out-Of-Distribution Detection with Diversification (Provably)

Authors: Yao, Haiyun, Han, Zongbo, Fu, Huazhu, Peng, Xi, Hu, Qinghua, Zhang, Changqing

Out-of-distribution (OOD) detection is crucial for ensuring reliable deployment of machine learning models. Recent advancements focus on utilizing easily accessible auxiliary outliers (e.g., data from the web or other datasets) in training. However,

External link: http://arxiv.org/abs/2411.14049

Report

SEEKR: Selective Attention-Guided Knowledge Retention for Continual Learning of Large Language Models

Authors: He, Jinghan, Guo, Haiyun, Zhu, Kuan, Zhao, Zihan, Tang, Ming, Wang, Jinqiao

Continual learning (CL) is crucial for language models to dynamically adapt to the evolving real-world demands. To mitigate the catastrophic forgetting problem in CL, data replay has been proven a simple and effective strategy, and the subsequent dat

External link: http://arxiv.org/abs/2411.06171

Report

Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents

Authors: Sun, Bolun, Zhou, Yifan, Jiang, Haiyun

This paper presents a novel application of large language models (LLMs) to enhance user comprehension of privacy policies through an interactive dialogue agent. We demonstrate that LLMs significantly outperform traditional models in tasks like Data P

External link: http://arxiv.org/abs/2410.11906

Report

CogDevelop2K: Reversed Cognitive Development in Multimodal Large Language Models

Authors: Li, Yijiang, Gao, Qingying, Sun, Haoran, Lyu, Haiyun, Luo, Dezhi, Deng, Hokin

Are Multi-modal Large Language Models (MLLMs) stochastic parrots? Do they genuinely understand? This paper aims to explore the core cognitive abilities that human intelligence builds upon to perceive, comprehend, and reason in MLLMs. To this end, we

External link: http://arxiv.org/abs/2410.10855

Report

Universally Optimal Watermarking Schemes for LLMs: from Theory to Practice

Authors: He, Haiyun, Liu, Yepeng, Wang, Ziqiao, Mao, Yongyi, Bu, Yuheng

Large Language Models (LLMs) boost human efficiency but also pose misuse risks, with watermarking serving as a reliable method to differentiate AI-generated content from human-created text. In this work, we propose a novel theoretical framework for

External link: http://arxiv.org/abs/2410.02890

Report

Vision Language Models Know Law of Conservation without Understanding More-or-Less

Authors: Luo, Dezhi, Lyu, Haiyun, Gao, Qingying, Sun, Haoran, Li, Yijiang, Deng, Hokin

Conservation is a critical milestone of cognitive development considered to be supported by both the understanding of quantitative concepts and the reversibility of mental operations. To assess whether this critical component of human intelligence ha

External link: http://arxiv.org/abs/2410.00332

Report

Vision Language Models See What You Want but not What You See

Authors: Gao, Qingying, Li, Yijiang, Lyu, Haiyun, Sun, Haoran, Luo, Dezhi, Deng, Hokin

Knowing others' intentions and taking others' perspectives are two core components of human intelligence that are typically considered to be instantiations of theory-of-mind. Infiltrating machines with these abilities is an important step towards bui

External link: http://arxiv.org/abs/2410.00324

Report

Probing Mechanical Reasoning in Large Vision Language Models

Authors: Sun, Haoran, Gao, Qingying, Lyu, Haiyun, Luo, Dezhi, Deng, Hokin, Li, Yijiang

Mechanical reasoning is a fundamental ability that sets human intelligence apart from other animal intelligence. Mechanical reasoning allows us to design tools, build bridges and canals, and construct houses which set the foundation of human civiliza

External link: http://arxiv.org/abs/2410.00318
