Showing 1 - 10 of 704 for search: '"Zhao Dongyan"'
Author:
Wang, Yueqian, Meng, Xiaojun, Wang, Yuxuan, Liang, Jianxin, Wei, Jiansheng, Zhang, Huishuai, Zhao, Dongyan
Recent research on video large language models (VideoLLM) predominantly focuses on model architectures and training datasets, leaving the interaction format between the user and the model under-explored. In existing works, users often interact with V…
External link:
http://arxiv.org/abs/2411.17991
Author:
Wu, Pengfei, Liu, Jiahao, Gong, Zhuocheng, Wang, Qifan, Li, Jinpeng, Wang, Jingang, Cai, Xunliang, Zhao, Dongyan
Published in:
NLPCC2024
Recent advancements in Large Language Models (LLMs) have shown remarkable performance across a wide range of tasks. Despite this, the auto-regressive nature of LLM decoding, which generates only a single token per forward propagation, fails to fully…
External link:
http://arxiv.org/abs/2410.20488
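The bottleneck this abstract points at is easiest to see in code. Below is a minimal sketch of plain auto-regressive decoding, with a made-up stand-in for the LLM forward pass (the `toy_forward` rule is an assumption for illustration, not any real model): each loop iteration runs one forward pass and emits exactly one token.

```python
# Minimal sketch of the standard auto-regressive loop the abstract refers
# to: every forward pass yields exactly one new token, which is the
# throughput bottleneck. `toy_forward` is a made-up stand-in for a real
# LLM forward pass (assumption, for illustration only).

def toy_forward(tokens: list[int]) -> int:
    return (tokens[-1] * 31 + 7) % 1000        # dummy next-token rule

def autoregressive_decode(prompt: list[int], max_new_tokens: int) -> list[int]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_token = toy_forward(tokens)       # one forward pass ...
        tokens.append(next_token)              # ... one token out
    return tokens

print(autoregressive_decode([1, 2, 3], max_new_tokens=5))
```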
Hallucination is a common issue in Multimodal Large Language Models (MLLMs), yet the underlying principles remain poorly understood. In this paper, we investigate which components of MLLMs contribute to object hallucinations. To analyze image representations…
External link:
http://arxiv.org/abs/2409.01151
Author:
Yuan, Danlong, Liu, Jiahao, Li, Bei, Zhang, Huishuai, Wang, Jingang, Cai, Xunliang, Zhao, Dongyan
While the Mamba architecture demonstrates superior inference efficiency and competitive performance on short-context natural language processing (NLP) tasks, empirical evidence suggests its capacity to comprehend long contexts is limited compared to…
External link:
http://arxiv.org/abs/2408.15496
To address the hallucination in generative question answering (GQA), where the answer cannot be derived from the document, we propose a novel evidence-enhanced triplet generation framework, EATQA, encouraging the model to predict all the combinations…
External link:
http://arxiv.org/abs/2408.15037
Author:
Du, Haowei, Zhao, Dongyan
In-context learning (ICL) of large language models (LLMs) has attracted increasing attention in the community, where LLMs make predictions based only on instructions augmented with a few examples. Existing example selection methods for ICL utilize spa…
External link:
http://arxiv.org/abs/2408.13028
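As context for the abstract's setup, here is a hedged sketch of in-context learning with example selection: a few demonstrations are chosen from a pool and prepended to the query. The Jaccard-overlap selector and the `build_icl_prompt` helper are illustrative assumptions, not the selection method this paper proposes.

```python
# Hedged sketch of the ICL setup: the prompt is an instruction plus a few
# selected demonstrations, and the model predicts from that context alone.
# The overlap scorer below is a generic stand-in for an example selector.

def overlap_score(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, len(wa | wb))

def build_icl_prompt(instruction: str, pool: list[dict], query: str, k: int = 2) -> str:
    # Select the k demonstrations most similar to the query.
    chosen = sorted(pool, key=lambda ex: overlap_score(ex["q"], query), reverse=True)[:k]
    demos = "\n".join(f"Q: {ex['q']}\nA: {ex['a']}" for ex in chosen)
    return f"{instruction}\n\n{demos}\n\nQ: {query}\nA:"

pool = [
    {"q": "What is the capital of France?", "a": "Paris"},
    {"q": "What is 2 + 2?", "a": "4"},
    {"q": "What is the capital of Japan?", "a": "Tokyo"},
]
print(build_icl_prompt("Answer the question.", pool, "What is the capital of Italy?"))
```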
Author:
Du, Haowei, Zhao, Dongyan
Recent works have attempted to integrate external knowledge into LLMs to address the limitations and potential factual errors in LLM-generated content. However, how to retrieve the correct knowledge from the large amount of external knowledge imposes…
External link:
http://arxiv.org/abs/2408.12979
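The retrieval problem the abstract raises can be made concrete with a generic sketch: score each external knowledge snippet against the query and keep the top matches for the LLM's context. The bag-of-words cosine retriever below is a stand-in assumption, not the paper's method.

```python
from collections import Counter
import math

# Generic retrieval sketch: rank knowledge snippets by similarity to the
# query and keep the top-k for the LLM's context. The bag-of-words cosine
# is an illustrative stand-in for whatever retriever the paper studies.

def bow_cosine(a: str, b: str) -> float:
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, knowledge: list[str], top_k: int = 2) -> list[str]:
    return sorted(knowledge, key=lambda doc: bow_cosine(query, doc), reverse=True)[:top_k]

kb = [
    "The Eiffel Tower is located in Paris and opened in 1889.",
    "Photosynthesis converts light energy into chemical energy.",
    "Paris is the capital and largest city of France.",
]
print(retrieve("When did the Eiffel Tower in Paris open?", kb))
```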
Author:
Gong, Zhuocheng, Liu, Jiahao, Wang, Ziyue, Wu, Pengfei, Wang, Jingang, Cai, Xunliang, Zhao, Dongyan, Yan, Rui
Speculative decoding has emerged as a promising technique to accelerate the inference of Large Language Models (LLMs) by employing a small language model to draft a hypothesis sequence, which is then validated by the LLM. The effectiveness of this approach…
External link:
http://arxiv.org/abs/2407.16207
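The draft-then-verify loop described in the abstract's first sentence can be sketched as follows. Both "models" here are toy deterministic rules (assumptions for illustration), and the per-position check stands in for the single verification forward pass a real target LLM would run over all draft positions at once.

```python
# Toy sketch of generic speculative decoding: a small drafter proposes k
# tokens, the target model verifies them and keeps the longest agreeing
# prefix plus one corrected token. Real systems verify all k positions in
# a single target-model forward pass; the per-token calls below are only
# for clarity. All numeric rules are made-up stand-ins.

def draft_next(tok: int) -> int:
    return (tok * 17 + 3) % 100                # cheap drafter's rule

def target_next(tok: int) -> int:
    # The target mostly agrees with the drafter, but diverges whenever the
    # previous token is divisible by 7 (arbitrary toy disagreement).
    return (tok + 1) % 100 if tok % 7 == 0 else draft_next(tok)

def speculative_step(tokens: list[int], k: int = 4) -> list[int]:
    draft, last = [], tokens[-1]
    for _ in range(k):                         # drafter proposes k tokens
        last = draft_next(last)
        draft.append(last)
    accepted, prev = [], tokens[-1]
    for tok in draft:                          # target verifies the draft
        expected = target_next(prev)
        if tok == expected:
            accepted.append(tok)               # verified: keep draft token
            prev = tok
        else:
            accepted.append(expected)          # mismatch: take target's token
            break                              # and discard the rest
    return tokens + accepted                   # >= 1 new token per step

seq = [5]
for _ in range(4):
    seq = speculative_step(seq)
print(seq)
```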
Video Question Answering (VideoQA) has emerged as a challenging frontier in the field of multimedia processing, requiring intricate interactions between visual and textual modalities. Simply uniformly sampling frames or indiscriminately aggregating frames…
External link:
http://arxiv.org/abs/2407.15047
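For reference, the "uniformly sampling frames" baseline the abstract critiques amounts to picking evenly spaced frame indices regardless of the question. A minimal sketch (the bin-center scheme is one common convention, assumed here):

```python
# Uniform frame sampling: choose num_samples evenly spaced indices from a
# video of num_frames frames, independent of the question being asked.

def uniform_frame_indices(num_frames: int, num_samples: int) -> list[int]:
    if num_samples >= num_frames:
        return list(range(num_frames))
    # Centers of num_samples equal-width bins over [0, num_frames).
    return [int((i + 0.5) * num_frames / num_samples) for i in range(num_samples)]

print(uniform_frame_indices(num_frames=300, num_samples=8))
# -> [18, 56, 93, 131, 168, 206, 243, 281]
```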
Author:
Gong, Zhuocheng, Lv, Ang, Guan, Jian, Yan, Junxi, Wu, Wei, Zhang, Huishuai, Huang, Minlie, Zhao, Dongyan, Yan, Rui
Is it always necessary to compute tokens from shallow to deep layers in Transformers? The continued success of vanilla Transformers and their variants suggests an undoubted "yes". In this work, however, we attempt to break the depth-ordered convention…
External link:
http://arxiv.org/abs/2407.06677
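The "shallow to deep" convention the abstract questions is simply the fixed sequential loop over layers in a vanilla Transformer. A minimal PyTorch sketch of that baseline (sizes are illustrative; this is not the paper's model):

```python
import torch
import torch.nn as nn

# Depth-ordered baseline: a vanilla Transformer applies layer 0, then 1,
# ..., then L-1, in a fixed order. All dimensions here are illustrative.

layers = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
    for _ in range(6)
)

x = torch.randn(2, 10, 64)      # (batch, sequence, hidden)
for layer in layers:            # tokens flow shallow -> deep, always
    x = layer(x)
print(x.shape)                  # torch.Size([2, 10, 64])
```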