Showing 1 - 10 of 45 for search: '"Han, Zongbo"'
Author:
Han, Zongbo, Yang, Jialong, Li, Junfan, Hu, Qinghua, Xu, Qianli, Shou, Mike Zheng, Zhang, Changqing
Vision-language foundation models (e.g., CLIP) have shown remarkable performance across a wide range of tasks. However, deploying these models may be unreliable when significant distribution gaps exist between the training and test data. The training
External link:
http://arxiv.org/abs/2409.19375
Author:
Zou, Ke, Lin, Tian, Han, Zongbo, Wang, Meng, Yuan, Xuedong, Chen, Haoyu, Zhang, Changqing, Shen, Xiaojing, Fu, Huazhu
Multi-modal ophthalmic image classification plays a key role in diagnosing eye diseases, as it integrates information from different sources to complement their respective performances. However, recent improvements have mainly focused on accuracy, of
External link:
http://arxiv.org/abs/2405.18167
Author:
Bai, Zechen, Wang, Pichao, Xiao, Tianjun, He, Tong, Han, Zongbo, Zhang, Zheng, Shou, Mike Zheng
This survey presents a comprehensive analysis of the phenomenon of hallucination in multimodal large language models (MLLMs), also known as Large Vision-Language Models (LVLMs), which have demonstrated significant advancements and remarkable abilitie
External link:
http://arxiv.org/abs/2404.18930
Author:
Zhang, Qingyang, Wei, Yake, Han, Zongbo, Fu, Huazhu, Peng, Xi, Deng, Cheng, Hu, Qinghua, Xu, Cai, Wen, Jie, Hu, Di, Zhang, Changqing
Multimodal fusion focuses on integrating information from multiple modalities with the goal of more accurate prediction, which has achieved remarkable progress in a wide range of scenarios, including autonomous driving and medical diagnosis. However,
External link:
http://arxiv.org/abs/2404.18947
Miscalibration in deep learning refers to a discrepancy between the predicted confidence and the actual performance. This problem usually arises from overfitting, which is characterized by learning everything presented in the training set
External link:
http://arxiv.org/abs/2402.08384
Recent advancements in large vision-language models (LVLMs) have demonstrated impressive capability in visual information understanding with human language. Despite these advances, LVLMs still face challenges with multimodal hallucination, such as ge
External link:
http://arxiv.org/abs/2402.01345
Published in:
CVPR 2024
Out-of-distribution (OOD) detection methods often exploit auxiliary outliers to train a model to identify OOD samples, especially by discovering challenging outliers from an auxiliary outlier dataset to improve OOD detection. However, they may still face li
External link:
http://arxiv.org/abs/2311.15243
Mixup is a well-established data augmentation technique, which can extend the training distribution and regularize the neural networks by creating "mixed" samples based on the label-equivariance assumption, i.e., a proportional mixup of the input d
External link:
http://arxiv.org/abs/2308.06451
Classifying incomplete multi-view data is inevitable since views are often arbitrarily missing in real-world applications. Although great progress has been achieved, existing incomplete multi-view methods still struggle to obtain a trustworthy
External link:
http://arxiv.org/abs/2304.05165
Author:
Han, Zongbo, Liang, Zhipeng, Yang, Fan, Liu, Liu, Li, Lanqing, Bian, Yatao, Zhao, Peilin, Hu, Qinghua, Wu, Bingzhe, Zhang, Changqing, Yao, Jianhua
Subpopulation shift, in which the training and test distributions contain the same subpopulation groups but in different proportions, exists widely in many real-world applications. Ignoring subpopulation shifts may lead to
External link:
http://arxiv.org/abs/2304.04148