Showing 1 - 10 of 2,102 for search: '"Xiong Gang"'
Zero-Shot Composed Image Retrieval (ZS-CIR) supports diverse tasks with a broad range of visual content manipulation intentions that can be related to domain, scene, object, and attribute. A key challenge for ZS-CIR is to accurately map image represe
External link:
http://arxiv.org/abs/2410.17393
Author:
Cheng, Jie, Qiao, Ruixi, Xiong, Gang, Miao, Qinghai, Ma, Yingwei, Li, Binhua, Li, Yongbin, Lv, Yisheng
A significant aspiration of offline reinforcement learning (RL) is to develop a generalist agent with high capabilities from large and heterogeneous datasets. However, prior approaches that scale offline RL either rely heavily on expert trajectories
External link:
http://arxiv.org/abs/2410.00564
Current text-video retrieval methods mainly rely on cross-modal matching between queries and videos to calculate their similarity scores, which are then sorted to obtain retrieval results. This method considers the matching between each candidate vid
External link:
http://arxiv.org/abs/2408.11432
Knowledge-based visual question answering requires external knowledge beyond visible content to answer the question correctly. One limitation of existing methods is that they focus more on modeling the inter-modal and intra-modal correlations, which
External link:
http://arxiv.org/abs/2408.07989
Author:
Qu, Xiangyan, Yu, Jing, Gai, Keke, Zhuang, Jiamin, Tang, Yuanmin, Xiong, Gang, Gou, Gaopeng, Wu, Qi
Recent work shows that documents from encyclopedias serve as helpful auxiliary information for zero-shot learning. Existing methods align the entire semantics of a document with corresponding images to transfer knowledge. However, they disregard that
External link:
http://arxiv.org/abs/2407.15613
Published in:
网络与信息安全学报, Vol 6, Iss 6, Pp 88-96 (2020)
The development of network threats shows the characteristics of initiative, concealment, and ubiquity. It poses a severe challenge to the passive, local, and isolated traditional network defense mode. In view of the new trend of integration of big data
External link:
https://doaj.org/article/e8c200cbcd5d456bbb4a591ab4596664
Author:
Yue, Tongtian, Cheng, Jie, Guo, Longteng, Dai, Xingyuan, Zhao, Zijia, He, Xingjian, Xiong, Gang, Lv, Yisheng, Liu, Jing
Recent trends in Large Vision Language Models (LVLMs) research have been increasingly focusing on advancing beyond general image understanding towards more nuanced, object-level referential comprehension. In this paper, we present and delve into the
External link:
http://arxiv.org/abs/2403.13263
Preference-based Reinforcement Learning (PbRL) circumvents the need for reward engineering by harnessing human preferences as the reward signal. However, current PbRL methods excessively depend on high-quality feedback from domain experts, which resu
External link:
http://arxiv.org/abs/2402.17257
Published in:
Jixie qiangdu, Vol 39, Pp 1204-1209 (2017)
Considering the uncertainty of the load, the fracture toughness, the yield strength, the tensile strength, and based on the failure assessment diagram technique (FAD), as well as combining with the finite element analysis results of crane structure, th
External link:
https://doaj.org/article/9652b956265e4ab38fa636980b76c649
Recent advances in vision-language pre-trained models (VLPs) have significantly increased visual understanding and cross-modal analysis capabilities. Companies have emerged to provide multi-modal Embedding as a Service (EaaS) based on VLPs (e.g., CLI
External link:
http://arxiv.org/abs/2311.05863