Showing 1 - 10 of 207 for search: '"WANG Zhenzhi"'
Author:
Li, Yixuan, Ran, Xingjian, Xu, Linning, Lu, Tao, Yu, Mulin, Wang, Zhenzhi, Xiangli, Yuanbo, Lin, Dahua, Dai, Bo
Buildings are primary components of cities, often featuring repeated elements such as windows and doors. Traditional 3D building asset creation is labor-intensive and requires specialized skills to develop design rules. Recent generative models for b
External link:
http://arxiv.org/abs/2412.07660
Author:
Wang, Zhenzhi, Li, Yixuan, Zeng, Yanhong, Fang, Youqing, Guo, Yuwei, Liu, Wenran, Tan, Jing, Chen, Kai, Xue, Tianfan, Dai, Bo, Lin, Dahua
Human image animation involves generating videos from a character photo, allowing user control and unlocking the potential for video and movie production. While recent approaches yield impressive results using high-quality training data, the inaccess
External link:
http://arxiv.org/abs/2407.17438
Text-conditioned motion synthesis has made remarkable progress with the emergence of diffusion models. However, the majority of these motion diffusion models are primarily designed for a single character and overlook multi-human interactions. In our
External link:
http://arxiv.org/abs/2311.15864
Neural radiance fields (NeRF) and its subsequent variants have led to remarkable progress in neural rendering. While most of recent neural rendering works focus on objects and small-scale scenes, developing neural rendering methods for city-scale sce
External link:
http://arxiv.org/abs/2309.16553
Published in:
In Gas Science and Engineering November 2024 131
Author:
Wang, Xianglong, Pan, Jienan, Jin, Yi, Du, Xuetian, Wang, Zhenzhi, Cheng, Nannan, Hou, Quanlin
Published in:
In Fuel 15 March 2025 384
Multi-modal Ads Video Understanding Challenge is the first grand challenge aiming to comprehensively understand ads videos. Our challenge includes two tasks: video structuring in the temporal dimension and multi-modal video classification. It asks th
External link:
http://arxiv.org/abs/2109.07951
Temporal grounding aims to localize a video moment which is semantically aligned with a given natural language query. Existing methods typically apply a detection or regression pipeline on the fused representation with the research focus on designing
External link:
http://arxiv.org/abs/2109.04872
Author:
Xia, Daping, Niu, Yunxia, Tian, Jixian, Su, Xianbo, Wei, Guoqin, Jian, Kuo, Wang, Zhenzhi, Zhang, Yawei, Zhao, Weizhong
Published in:
In Fuel 15 May 2024 364
Spatio-temporal action detection is an important and challenging problem in video understanding. The existing action detection benchmarks are limited in aspects of small numbers of instances in a trimmed video or low-level atomic actions. This paper
External link:
http://arxiv.org/abs/2105.07404