Showing 1 - 10
of 921
for search: '"ZHANG Jianbing"'
Published in:
Cailiao gongcheng, Vol 51, Iss 1, Pp 171-178 (2023)
In order to prepare low-expansion, high-strength, light-weight composites, ZrW2O8-Cf/E51 composites were prepared by compression molding, and the effects of ultrasonic time on their microstructure, thermal expansion behavior and ultimate tensile strength…
External link:
https://doaj.org/article/5a524f164eca4a3289fccdb1c8c08577
Chain-of-thought (CoT) has proven to improve the reasoning capability of large language models (LLMs). However, due to the complexity of multimodal scenarios and the difficulty in collecting high-quality CoT data, CoT reasoning in multimodal LLMs has…
External link:
http://arxiv.org/abs/2411.00855
Author:
Dong, Zehao, Zhang, Yang, Chiu, Chun-Chien, Lu, Sicheng, Zhang, Jianbing, Liu, Yu-Chen, Liu, Suya, Yang, Jan-Chi, Yu, Pu, Wang, Yayu, Chen, Zhen
Real-space imaging of three-dimensional atomic structures is a critical yet challenging task in materials science. Although scanning transmission electron microscopy has achieved sub-angstrom lateral resolution through techniques like electron ptychography…
External link:
http://arxiv.org/abs/2406.04252
Contrastive Language-Image Pre-training (CLIP) has shown powerful zero-shot learning performance. Few-shot learning aims to further enhance the transfer capability of CLIP by giving few images in each class, aka 'few shots'. Most existing methods either…
External link:
http://arxiv.org/abs/2404.09778
Relation extraction is a critical task in the field of natural language processing with numerous real-world applications. Existing research primarily focuses on monolingual relation extraction or cross-lingual enhancement for relation extraction. Yet…
External link:
http://arxiv.org/abs/2403.15696
Author:
Ma, Zheng, Wang, Changxin, Ouyang, Yawen, Zhao, Fei, Zhang, Jianbing, Huang, Shujian, Chen, Jiajun
Evaluating the compatibility between textual descriptions and corresponding images represents a core endeavor within multi-modal research. In recent years, a proliferation of reference-free methods, leveraging visual-language pre-trained models (VLMs)…
External link:
http://arxiv.org/abs/2402.11572
Author:
Xing, Shangyu, Zhao, Fei, Wu, Zhen, An, Tuo, Chen, Weihao, Li, Chunhui, Zhang, Jianbing, Dai, Xinyu
Multimodal large language models (MLLMs) have attracted increasing attention in the past few years, but they may still generate descriptions that include objects not present in the corresponding images, a phenomenon known as object hallucination. To…
External link:
http://arxiv.org/abs/2402.09801
Author:
Cheng, Kanzhi, Sun, Qiushi, Chu, Yougang, Xu, Fangzhi, Li, Yantao, Zhang, Jianbing, Wu, Zhiyong
Graphical User Interface (GUI) agents are designed to automate complex tasks on digital devices, such as smartphones and desktops. Most existing GUI agents interact with the environment through extracted structured data, which can be notably lengthy…
External link:
http://arxiv.org/abs/2401.10935
Published in:
In Proceedings of the 31st ACM International Conference on Multimedia, pp. 4564-4573. 2023
Extracting relational facts from multimodal data is a crucial task in the field of multimedia and knowledge graphs that feeds into widespread real-world applications. The emphasis of recent studies centers on recognizing relational facts in which both…
External link:
http://arxiv.org/abs/2312.09753
Author:
Pan, Mianzhi, Li, Jianfei, Yu, Mingyue, Ma, Zheng, Cheng, Kanzhi, Zhang, Jianbing, Chen, Jiajun
Commonsense reasoning, the ability to make logical assumptions about daily scenes, is one core intelligence of human beings. In this work, we present a novel task and dataset for evaluating the ability of text-to-image generative models to conduct commonsense reasoning…
External link:
http://arxiv.org/abs/2312.07294