Showing 1 - 10 of 188 results for search: '"Chen, Daoyuan"'
High-performance Multimodal Large Language Models (MLLMs) rely heavily on data quality. This study introduces a novel dataset named Img-Diff, designed to enhance fine-grained image recognition in MLLMs by leveraging insights from contrastive learning…
External link:
http://arxiv.org/abs/2408.04594
The emergence of large-scale multi-modal generative models has drastically advanced artificial intelligence, introducing unprecedented levels of performance and functionality. However, optimizing these models remains challenging…
External link:
http://arxiv.org/abs/2407.11784
Authors:
Qin, Zhen, Chen, Daoyuan, Zhang, Wenhao, Yao, Liuyi, Huang, Yilun, Ding, Bolin, Li, Yaliang, Deng, Shuiguang
The rapid development of large language models (LLMs) has been witnessed in recent years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from text to a broader spectrum of domains, attracting widespread attention…
External link:
http://arxiv.org/abs/2407.08583
Large language models exhibit exceptional generalization capabilities, primarily attributed to the utilization of diversely sourced data. However, conventional practices in integrating this diverse data heavily rely on heuristic schemes…
External link:
http://arxiv.org/abs/2405.14908
Emotional Support Conversation (ESC) systems are pivotal in providing empathetic interactions, aiding users through negative emotional states by understanding and addressing their unique experiences. In this paper, we tackle two key challenges in ESC…
External link:
http://arxiv.org/abs/2404.02505
Data visualization serves as a critical means for presenting data and mining its valuable insights. The task of chart summarization, through natural language processing techniques, facilitates in-depth data analysis of charts…
External link:
http://arxiv.org/abs/2403.11236
Authors:
Gao, Dawei, Li, Zitao, Pan, Xuchen, Kuang, Weirui, Ma, Zhijian, Qian, Bingchen, Wei, Fei, Zhang, Wenhao, Xie, Yuexiang, Chen, Daoyuan, Yao, Liuyi, Peng, Hongyi, Zhang, Zeyu, Zhu, Lin, Cheng, Chen, Shi, Hongzhu, Li, Yaliang, Ding, Bolin, Zhou, Jingren
With the rapid advancement of Large Language Models (LLMs), significant progress has been made in multi-agent applications. However, the complexities in coordinating agents' cooperation and LLMs' erratic performance pose notable challenges…
External link:
http://arxiv.org/abs/2402.14034
Federated Learning (FL) has recently been applied to the parameter-efficient fine-tuning of Large Language Models (LLMs). While promising, it raises significant challenges due to the heterogeneous resources and data distributions of clients…
External link:
http://arxiv.org/abs/2402.11505
The confluence of Federated Learning (FL) and Large Language Models (LLMs) is ushering in a new era in privacy-preserving natural language processing. However, the intensive memory requirements for fine-tuning LLMs pose significant challenges…
External link:
http://arxiv.org/abs/2402.05926
Despite the impressive capabilities of Multimodal Large Language Models (MLLMs) in integrating text and image modalities, challenges remain in accurately interpreting detailed visual elements. This paper presents an empirical study on enhancing MLLMs…
External link:
http://arxiv.org/abs/2401.17981