Showing 1 - 10 of 6,426 for search: '"Zhang,Pan"'
Author:
Lu, Kaixuan, Zhang, Ruiqian, Huang, Xiao, Xie, Yuxing, Ning, Xiaogang, Zhang, Hanchao, Yuan, Mengke, Zhang, Pan, Wang, Tao, Liao, Tongkui
Recent self-supervised learning (SSL) methods have demonstrated impressive results in learning visual representations from unlabeled remote sensing images. However, most remote sensing images predominantly consist of scenographic scenes containing mu
External link:
http://arxiv.org/abs/2411.06091
Author:
Liu, Ziyu, Zang, Yuhang, Dong, Xiaoyi, Zhang, Pan, Cao, Yuhang, Duan, Haodong, He, Conghui, Xiong, Yuanjun, Lin, Dahua, Wang, Jiaqi
Visual preference alignment involves training Large Vision-Language Models (LVLMs) to predict human preferences between visual inputs. This is typically achieved by using labeled datasets of chosen/rejected pairs and employing optimization algorithms
External link:
http://arxiv.org/abs/2410.17637
Author:
Xing, Long, Huang, Qidong, Dong, Xiaoyi, Lu, Jiajie, Zhang, Pan, Zang, Yuhang, Cao, Yuhang, He, Conghui, Wang, Jiaqi, Wu, Feng, Lin, Dahua
In large vision-language models (LVLMs), images serve as inputs that carry a wealth of information. As the idiom "A picture is worth a thousand words" implies, representing a single image in current LVLMs can require hundreds or even thousands of tok
External link:
http://arxiv.org/abs/2410.17247
Author:
Ding, Shuangrui, Qian, Rui, Dong, Xiaoyi, Zhang, Pan, Zang, Yuhang, Cao, Yuhang, Guo, Yuwei, Lin, Dahua, Wang, Jiaqi
The Segment Anything Model 2 (SAM 2) has emerged as a powerful foundation model for object segmentation in both images and videos, paving the way for various downstream video applications. The crucial design of SAM 2 for video segmentation is its mem
External link:
http://arxiv.org/abs/2410.16268
Author:
Huang, Qidong, Dong, Xiaoyi, Zhang, Pan, Zang, Yuhang, Cao, Yuhang, Wang, Jiaqi, Lin, Dahua, Zhang, Weiming, Yu, Nenghai
We present the Modality Integration Rate (MIR), an effective, robust, and generalized metric to indicate the multi-modal pre-training quality of Large Vision Language Models (LVLMs). Large-scale pre-training plays a critical role in building capable
External link:
http://arxiv.org/abs/2410.07167
Author:
Bu, Jiazi, Ling, Pengyang, Zhang, Pan, Wu, Tong, Dong, Xiaoyi, Zang, Yuhang, Cao, Yuhang, Lin, Dahua, Wang, Jiaqi
The text-to-video (T2V) generation models, offering convenient visual creation, have recently garnered increasing attention. Despite their substantial potential, the generated videos may present artifacts, including structural implausibility, tempora
External link:
http://arxiv.org/abs/2410.06241
Estimating free energy is a fundamental problem in statistical mechanics. Recently, machine-learning-based methods, particularly the variational autoregressive networks (VANs), have been proposed to minimize variational free energy and to approximate
External link:
http://arxiv.org/abs/2409.20029
Disordered lattice spin systems are crucial in both theoretical and applied physics. However, understanding their properties poses significant challenges for Monte Carlo simulations. In this work, we investigate the two-dimensional random-bond Ising
External link:
http://arxiv.org/abs/2409.06538
Author:
Xu, Linqiang, Zhao, Liya, Lau, Chit Siong, Zhang, Pan, Xu, Lianqiang, Li, Qiuhui, Fang, Shibo, Ang, Yee Sin, Sun, Xiaotian, Lu, Jing
Wide bandgap oxide semiconductors are very promising channel candidates for next-generation electronics due to their large-area manufacturing, high-quality dielectrics, low contact resistance, and low leakage current. However, the absence of ultra-sh
External link:
http://arxiv.org/abs/2408.07339
In this study, we obtain a specific picture of the phase transitions for the 6-dimensional Gauss-Bonnet Anti-de Sitter (AdS) black hole with triple phases, using the generalized free energy we constructed and the Kramers escape rate in stochastic motion. T
External link:
http://arxiv.org/abs/2407.20512