Showing 1 - 10 of 715 for search: '"Li, YiNing"'
Multi-objective Markov Decision Processes (MDPs) are receiving increasing attention, as real-world decision-making problems often involve conflicting objectives that cannot be addressed by a single-objective MDP. The Pareto front identifies the set o
External link:
http://arxiv.org/abs/2410.15557
Author:
Zhang, Xiaofei, Li, Yining, Wang, Jinping, Qin, Xiangyi, Shen, Ying, Fan, Zhengping, Tan, Xiaojun
Perception systems of autonomous vehicles are susceptible to occlusion, especially when examined from a vehicle-centric perspective. Such occlusion can lead to overlooked object detections, e.g., larger vehicles such as trucks or buses may create bli
External link:
http://arxiv.org/abs/2407.21581
Significant focus has been placed on integrating large language models (LLMs) with various tools in developing general-purpose agents. This poses a challenge to LLMs' tool-use capabilities. However, there are evident gaps between existing tool-use ev
External link:
http://arxiv.org/abs/2407.08713
Whole-body pose estimation is a challenging task that requires simultaneous prediction of keypoints for the body, hands, face, and feet. Whole-body pose estimation aims to predict fine-grained pose information for the human body, including the face,
External link:
http://arxiv.org/abs/2407.08634
Pansharpening aims to generate a high spatial resolution multispectral image (HRMS) by fusing a low spatial resolution multispectral image (LRMS) and a panchromatic image (PAN). The most challenging issue for this task is that only the to-be-fused LR
External link:
http://arxiv.org/abs/2407.06633
Author:
Liu, Jiajun, Ke, Wenjun, Wang, Peng, Wang, Jiahao, Gao, Jinhua, Shang, Ziyu, Li, Guozheng, Xu, Zijie, Ji, Ke, Li, Yining
Continual Knowledge Graph Embedding (CKGE) aims to efficiently learn new knowledge and simultaneously preserve old knowledge. Dominant approaches primarily focus on alleviating catastrophic forgetting of old knowledge but neglect efficient learning f
External link:
http://arxiv.org/abs/2407.05705
Author:
Zhang, Pan, Dong, Xiaoyi, Zang, Yuhang, Cao, Yuhang, Qian, Rui, Chen, Lin, Guo, Qipeng, Duan, Haodong, Wang, Bin, Ouyang, Linke, Zhang, Songyang, Zhang, Wenwei, Li, Yining, Gao, Yang, Sun, Peng, Zhang, Xinyue, Li, Wei, Li, Jingwen, Wang, Wenhai, Yan, Hang, He, Conghui, Zhang, Xingcheng, Chen, Kai, Dai, Jifeng, Qiao, Yu, Lin, Dahua, Wang, Jiaqi
We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities
External link:
http://arxiv.org/abs/2407.03320
Author:
Chen, Yicheng, Li, Xiangtai, Li, Yining, Zeng, Yanhong, Wu, Jianzong, Zhao, Xiangyu, Chen, Kai
Diffusion models can generate realistic and diverse images, potentially facilitating data availability for data-intensive perception tasks. However, leveraging these models to boost performance on downstream tasks with synthetic data poses several ch
External link:
http://arxiv.org/abs/2406.20085
Multi-modal large language models (MLLMs) have made significant strides in various visual understanding tasks. However, the majority of these models are constrained to process low-resolution images, which limits their effectiveness in perception task
External link:
http://arxiv.org/abs/2406.17770
Author:
Wu, Jianzong, Li, Xiangtai, Zeng, Yanhong, Zhang, Jiangning, Zhou, Qianyu, Li, Yining, Tong, Yunhai, Chen, Kai
In this work, we present MotionBooth, an innovative framework designed for animating customized subjects with precise control over both object and camera movements. By leveraging a few images of a specific object, we efficiently fine-tune a text-to-v
External link:
http://arxiv.org/abs/2406.17758