Showing 1 - 10 of 484 results for search: '"WANG Junyang"'
Published in:
Cailiao gongcheng, Vol 52, Iss 2, pp 172-179 (2024)
A γ′ phase strengthened NiCoCrFeAlTiMoW alloy was prepared in a vacuum arc melting furnace, and an X-ray diffractometer (XRD), a scanning electron microscope (SEM), an energy dispersive spectrometer (EDS), and a tensile testing machine were used to investigate…
External link:
https://doaj.org/article/669d4719af86478891bb0ea3baea2d1c
Author:
Wang, Junyang, Xu, Haiyang, Jia, Haitao, Zhang, Xi, Yan, Ming, Shen, Weizhou, Zhang, Ji, Huang, Fei, Sang, Jitao
Mobile device operation tasks are increasingly becoming a popular multi-modal AI application scenario. Current Multi-modal Large Language Models (MLLMs), constrained by their training data, lack the capability to function effectively as operation assistants…
External link:
http://arxiv.org/abs/2406.01014
Author:
Wang, Junyang, Xu, Haiyang, Ye, Jiabo, Yan, Ming, Shen, Weizhou, Zhang, Ji, Huang, Fei, Sang, Jitao
Mobile device agents based on Multimodal Large Language Models (MLLM) are becoming a popular application. In this paper, we introduce Mobile-Agent, an autonomous multi-modal mobile device agent. Mobile-Agent first leverages visual perception tools to…
External link:
http://arxiv.org/abs/2401.16158
Author:
Wang, Junyang, Wang, Yuhang, Xu, Guohai, Zhang, Jing, Gu, Yukai, Jia, Haitao, Wang, Jiaqi, Xu, Haiyang, Yan, Ming, Zhang, Ji, Sang, Jitao
Despite making significant progress in multi-modal tasks, current Multi-modal Large Language Models (MLLMs) encounter the significant challenge of hallucinations, which may lead to harmful consequences. Therefore, evaluating MLLMs' hallucinations is…
External link:
http://arxiv.org/abs/2311.07397
Author:
Wang, Junyang, Zhou, Yiyang, Xu, Guohai, Shi, Pengcheng, Zhao, Chenlin, Xu, Haiyang, Ye, Qinghao, Yan, Ming, Zhang, Ji, Zhu, Jihua, Sang, Jitao, Tang, Haoyu
Large Vision-Language Models (LVLMs) have recently achieved remarkable success. However, LVLMs are still plagued by the hallucination problem, which limits their practicality in many scenarios. Hallucination refers to information in LVLMs' responses…
External link:
http://arxiv.org/abs/2308.15126
Author:
Shi, Pengcheng, Zhang, Jie, Cheng, Haozhe, Wang, Junyang, Zhou, Yiyang, Zhao, Chenlin, Zhu, Jihua
Point cloud registration is a fundamental problem in many domains. In practice, the overlap between point clouds to be registered may be relatively small. Most unsupervised methods lack effective initial evaluation of overlap, leading to suboptimal…
External link:
http://arxiv.org/abs/2308.09364
Machine learning models often learn to make predictions that rely on sensitive social attributes like gender and race, which poses significant fairness risks, especially in societal applications such as hiring, banking, and criminal justice. Existing…
External link:
http://arxiv.org/abs/2308.08482
Author:
Ye, Qinghao, Xu, Haiyang, Xu, Guohai, Ye, Jiabo, Yan, Ming, Zhou, Yiyang, Wang, Junyang, Hu, Anwen, Shi, Pengcheng, Shi, Yaya, Li, Chenliang, Xu, Yuanhong, Chen, Hehong, Tian, Junfeng, Qian, Qi, Zhang, Ji, Huang, Fei, Zhou, Jingren
Large language models (LLMs) have demonstrated impressive zero-shot abilities on a variety of open-ended tasks, while recent research has also explored the use of LLMs for multi-modal generation. In this study, we introduce mPLUG-Owl, a novel training…
External link:
http://arxiv.org/abs/2304.14178
With the development of Vision-Language Pre-training Models (VLPMs), represented by CLIP and ALIGN, significant breakthroughs have been achieved for association-based visual tasks such as image classification and image-text retrieval by the zero-shot…
External link:
http://arxiv.org/abs/2304.13273
Fine-tuning a visual pre-trained model can leverage the semantic information from large-scale pre-training data and mitigate the over-fitting problem on downstream vision tasks with limited training examples. While the problem of catastrophic forgetting…
External link:
http://arxiv.org/abs/2304.01489