Showing 1 - 10 of 824 for search: '"Yang, YuHang"'
Controllable human image animation aims to generate videos from reference images using driving videos. Due to the limited control signals provided by sparse guidance (e.g., skeleton pose), recent works have attempted to introduce additional dense con
External link:
http://arxiv.org/abs/2412.09349
$f(Q)$ and $f(T)$ gravity are based on fundamentally different geometric frameworks, yet they exhibit many similar properties. In this article, we identify two types of background-dependent and classical correspondences between these two theories of
External link:
http://arxiv.org/abs/2412.01104
GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding
Open-Vocabulary 3D object affordance grounding aims to anticipate "action possibilities" regions on 3D objects with arbitrary instructions, which is crucial for robots to generically perceive real scenarios and respond to operational changes. Exist
External link:
http://arxiv.org/abs/2411.19626
While vision-language models like CLIP have shown remarkable success in open-vocabulary tasks, their application is currently confined to image-level tasks, and they still struggle with dense predictions. Recent works often attribute such deficiency
External link:
http://arxiv.org/abs/2411.15851
Authors:
Gong, Zheng, Deng, Zhuo, Gao, Weihao, Zhou, Wenda, Yang, Yuhang, Zhao, Hanqing, Niu, Zhiyuan, Shao, Lei, Wei, Wenbin, Ma, Lan
Cataract is one of the most common blinding eye diseases and can be treated by surgery. However, because cataract patients may also suffer from other blinding eye diseases, ophthalmologists must diagnose them before surgery. The cloudy lens of catara
External link:
http://arxiv.org/abs/2411.12278
Authors:
Su, Aofeng, Wang, Aowen, Ye, Chao, Zhou, Chen, Zhang, Ga, Chen, Gang, Zhu, Guangcheng, Wang, Haobo, Xu, Haokai, Chen, Hao, Li, Haoze, Lan, Haoxuan, Tian, Jiaming, Yuan, Jing, Zhao, Junbo, Zhou, Junlin, Shou, Kaizhe, Zha, Liangyu, Long, Lin, Li, Liyao, Wu, Pengzuo, Zhang, Qi, Huang, Qingyi, Yang, Saisai, Zhang, Tao, Ye, Wentao, Zhu, Wufang, Hu, Xiaomeng, Gu, Xijun, Sun, Xinjie, Li, Xiang, Yang, Yuhang, Xiao, Zhiqing
The emergence of models like GPTs, Claude, LLaMA, and Qwen has reshaped AI applications, presenting vast new opportunities across industries. Yet, the integration of tabular data remains notably underdeveloped, despite its foundational role in numero
External link:
http://arxiv.org/abs/2411.02059
High-quality video generation, encompassing text-to-video (T2V), image-to-video (I2V), and video-to-video (V2V) generation, holds considerable significance in content creation, helping anyone express their inherent creativity in new ways and world
External link:
http://arxiv.org/abs/2410.05227
Grounding 3D scene affordance aims to locate interactive regions in 3D environments, which is crucial for embodied agents to interact intelligently with their surroundings. Most existing approaches achieve this by mapping semantics to 3D instances ba
External link:
http://arxiv.org/abs/2409.19650
Channel knowledge map (CKM) is a promising technology to enable environment-aware wireless communications and sensing. Link state map (LSM) is one particular type of CKM that aims to learn the location-specific line-of-sight (LoS) link probability be
External link:
http://arxiv.org/abs/2409.00016
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception, facilitating applications like AR/VR and embodied AI. For the egocentric HOI, in addition to perceiving semantics e.g., "what" interaction
External link:
http://arxiv.org/abs/2405.13659