Showing 1 - 10 of 238 for search: '"Xu, Yinghao"'
Author:
Xu, Mengda, Xu, Zhenjia, Xu, Yinghao, Chi, Cheng, Wetzstein, Gordon, Veloso, Manuela, Song, Shuran
We present Im2Flow2Act, a scalable learning framework that enables robots to acquire real-world manipulation skills without the need for real-world robot training data. The key idea behind Im2Flow2Act is to use object flow as the manipulation interface…
External link:
http://arxiv.org/abs/2407.15208
Author:
Zhang, Qihang, Xu, Yinghao, Wang, Chaoyang, Lee, Hsin-Ying, Wetzstein, Gordon, Zhou, Bolei, Yang, Ceyuan
Scene image editing is crucial for entertainment, photography, and advertising design. Existing methods focus solely on either 2D individual-object or 3D global-scene editing. This results in a lack of a unified approach to effectively control and ma…
External link:
http://arxiv.org/abs/2405.18424
Author:
Kuang, Zhengfei, Cai, Shengqu, He, Hao, Xu, Yinghao, Li, Hongsheng, Guibas, Leonidas, Wetzstein, Gordon
Research on video generation has recently made tremendous progress, enabling high-quality videos to be generated from text prompts or images. Adding control to the video generation process is an important goal moving forward, and recent approaches tha…
External link:
http://arxiv.org/abs/2405.17414
Controllability plays a crucial role in video generation, since it allows users to create desired content. However, existing models have largely overlooked the precise control of camera pose, which serves as a cinematic language to express deeper narrative n…
External link:
http://arxiv.org/abs/2404.02101
Author:
Xu, Yinghao, Shi, Zifan, Yifan, Wang, Chen, Hansheng, Yang, Ceyuan, Peng, Sida, Shen, Yujun, Wetzstein, Gordon
We introduce GRM, a large-scale reconstructor capable of recovering a 3D asset from sparse-view images in around 0.1s. GRM is a feed-forward transformer-based model that efficiently incorporates multi-view information to translate the input pixels in…
External link:
http://arxiv.org/abs/2403.14621
Author:
Bai, Qingyan, Shi, Zifan, Xu, Yinghao, Ouyang, Hao, Wang, Qiuyu, Yang, Ceyuan, Wang, Xuan, Wetzstein, Gordon, Shen, Yujun, Chen, Qifeng
This work presents 3DPE, a practical method that can efficiently edit a face image following given prompts, like reference images or text descriptions, in a 3D-aware manner. To this end, a lightweight module is distilled from a 3D portrait generator…
External link:
http://arxiv.org/abs/2402.14000
Author:
Zhang, Qihang, Wang, Chaoyang, Siarohin, Aliaksandr, Zhuang, Peiye, Xu, Yinghao, Yang, Ceyuan, Lin, Dahua, Zhou, Bolei, Tulyakov, Sergey, Lee, Hsin-Ying
We are witnessing significant breakthroughs in the technology for generating 3D objects from text. Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. Gener…
External link:
http://arxiv.org/abs/2312.08885
Author:
Cheng, Ka Leong, Wang, Qiuyu, Shi, Zifan, Zheng, Kecheng, Xu, Yinghao, Ouyang, Hao, Chen, Qifeng, Shen, Yujun
Neural radiance fields, which represent a 3D scene as a color field and a density field, have demonstrated great progress in novel view synthesis, yet they are unfavorable for editing due to their implicitness. In view of this deficiency, we propose to rep…
External link:
http://arxiv.org/abs/2312.06657
Generating large-scale 3D scenes cannot be achieved by simply applying existing 3D object synthesis techniques, since 3D scenes usually hold complex spatial configurations and consist of a number of objects at varying scales. We thus propose a practical and efficient 3D…
External link:
http://arxiv.org/abs/2312.02136
Author:
Wang, Peng, Tan, Hao, Bi, Sai, Xu, Yinghao, Luan, Fujun, Sunkavalli, Kalyan, Wang, Wenping, Xu, Zexiang, Zhang, Kai
We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU. PF-LRM…
External link:
http://arxiv.org/abs/2311.12024