Search results - "Yang, YuHang"

Report

DisPose: Disentangling Pose Guidance for Controllable Human Image Animation

Authors: Li, Hongxiang, Li, Yaowei, Yang, Yuhang, Cao, Junjie, Zhu, Zhihong, Cheng, Xuxin, Chen, Long

Controllable human image animation aims to generate videos from reference images using driving videos. Due to the limited control signals provided by sparse guidance (e.g., skeleton pose), recent works have attempted to introduce additional dense con

External link: http://arxiv.org/abs/2412.09349

Report

Background-dependent and classical correspondences between $f(Q)$ and $f(T)$ gravity

Authors: Wu, Cheng, Ren, Xin, Yang, Yuhang, Hu, Yu-Min, Saridakis, Emmanuel N.

$f(Q)$ and $f(T)$ gravity are based on fundamentally different geometric frameworks, yet they exhibit many similar properties. In this article, we identify two types of background-dependent and classical correspondences between these two theories of

External link: http://arxiv.org/abs/2412.01104

Report

GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding

Authors: Shao, Yawen, Zhai, Wei, Yang, Yuhang, Luo, Hongchen, Cao, Yang, Zha, Zheng-Jun

Open-Vocabulary 3D object affordance grounding aims to anticipate "action possibilities" regions on 3D objects with arbitrary instructions, which is crucial for robots to generically perceive real scenarios and respond to operational changes. Exist

External link: http://arxiv.org/abs/2411.19626

Report

ResCLIP: Residual Attention for Training-free Dense Vision-language Inference

Authors: Yang, Yuhang, Deng, Jinhong, Li, Wen, Duan, Lixin

While vision-language models like CLIP have shown remarkable success in open-vocabulary tasks, their application is currently confined to image-level tasks, and they still struggle with dense predictions. Recent works often attribute such deficiency

External link: http://arxiv.org/abs/2411.15851

Report

Versatile Cataract Fundus Image Restoration Model Utilizing Unpaired Cataract and High-quality Images

Authors: Gong, Zheng, Deng, Zhuo, Gao, Weihao, Zhou, Wenda, Yang, Yuhang, Zhao, Hanqing, Niu, Zhiyuan, Shao, Lei, Wei, Wenbin, Ma, Lan

Cataract is one of the most common blinding eye diseases and can be treated by surgery. However, because cataract patients may also suffer from other blinding eye diseases, ophthalmologists must diagnose them before surgery. The cloudy lens of catara

External link: http://arxiv.org/abs/2411.12278

Report

TableGPT2: A Large Multimodal Model with Tabular Data Integration

The emergence of models like GPTs, Claude, LLaMA, and Qwen has reshaped AI applications, presenting vast new opportunities across industries. Yet, the integration of tabular data remains notably underdeveloped, despite its foundational role in numero

External link: http://arxiv.org/abs/2411.02059

Report

The Dawn of Video Generation: Preliminary Explorations with SORA-like Models

Authors: Zeng, Ailing, Yang, Yuhang, Chen, Weidong, Liu, Wei

High-quality video generation, encompassing text-to-video (T2V), image-to-video (I2V), and video-to-video (V2V) generation, holds considerable significance in content creation, helping anyone express their inherent creativity in new ways and world

External link: http://arxiv.org/abs/2410.05227

Report

Grounding 3D Scene Affordance From Egocentric Interactions

Authors: Liu, Cuiyu, Zhai, Wei, Yang, Yuhang, Luo, Hongchen, Liang, Sen, Cao, Yang, Zha, Zheng-Jun

Grounding 3D scene affordance aims to locate interactive regions in 3D environments, which is crucial for embodied agents to interact intelligently with their surroundings. Most existing approaches achieve this by mapping semantics to 3D instances ba

External link: http://arxiv.org/abs/2409.19650

Report

Channel Knowledge Map for Cellular-Connected UAV via Binary Bayesian Filtering

Authors: Yang, Yuhang, Xu, Xiaoli, Zeng, Yong, Sun, Haijian, Hu, Rose Qingyang

Channel knowledge map (CKM) is a promising technology to enable environment-aware wireless communications and sensing. Link state map (LSM) is one particular type of CKM that aims to learn the location-specific line-of-sight (LoS) link probability be

External link: http://arxiv.org/abs/2409.00016

Report

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

Authors: Yang, Yuhang, Zhai, Wei, Wang, Chengfeng, Yu, Chengjun, Cao, Yang, Zha, Zheng-Jun

Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception, facilitating applications like AR/VR and embodied AI. For the egocentric HOI, in addition to perceiving semantics e.g., "what" interaction

External link: http://arxiv.org/abs/2405.13659
