Showing 1 - 10 of 60 for search: '"Ge, Yunhao"'
Author:
NVIDIA, Bala, Maciej, Cui, Yin, Ding, Yifan, Ge, Yunhao, Hao, Zekun, Hasselgren, Jon, Huffman, Jacob, Jin, Jingyi, Lewis, J. P., Li, Zhaoshuo, Lin, Chen-Hsuan, Lin, Yen-Chen, Lin, Tsung-Yi, Liu, Ming-Yu, Luo, Alice, Ma, Qianli, Munkberg, Jacob, Shi, Stella, Wei, Fangyin, Xiang, Donglai, Xu, Jiashu, Zeng, Xiaohui, Zhang, Qinsheng
We introduce Edify 3D, an advanced solution designed for high-quality 3D asset generation. Our method first synthesizes RGB and surface normal images of the described object at multiple viewpoints using a diffusion model. The multi-view observations …
External link:
http://arxiv.org/abs/2411.07135
Author:
NVIDIA, Atzmon, Yuval, Bala, Maciej, Balaji, Yogesh, Cai, Tiffany, Cui, Yin, Fan, Jiaojiao, Ge, Yunhao, Gururani, Siddharth, Huffman, Jacob, Isaac, Ronald, Jannaty, Pooya, Karras, Tero, Lam, Grace, Lewis, J. P., Licata, Aaron, Lin, Yen-Chen, Liu, Ming-Yu, Ma, Qianli, Mallya, Arun, Martino-Tarr, Ashlee, Mendez, Doug, Nah, Seungjun, Pruett, Chris, Reda, Fitsum, Song, Jiaming, Wang, Ting-Chun, Wei, Fangyin, Zeng, Xiaohui, Zeng, Yu, Zhang, Qinsheng
We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image utilizes cascaded pixel-space diffusion models trained using a novel Laplacian diffusion process …
External link:
http://arxiv.org/abs/2411.07126
Author:
Ge, Yunhao, Tang, Yihe, Xu, Jiashu, Gokmen, Cem, Li, Chengshu, Ai, Wensi, Martinez, Benjamin Jose, Aydin, Arman, Anvari, Mona, Chakravarthy, Ayush K, Yu, Hong-Xing, Wong, Josiah, Srivastava, Sanjana, Lee, Sharon, Zha, Shengxin, Itti, Laurent, Li, Yunzhu, Martín-Martín, Roberto, Liu, Miao, Zhang, Pengchuan, Zhang, Ruohan, Fei-Fei, Li, Wu, Jiajun
The systematic evaluation and understanding of computer vision models under varying conditions require large amounts of data with comprehensive and customized labels, which real-world vision datasets rarely satisfy. While current synthetic data …
External link:
http://arxiv.org/abs/2405.09546
Existing automatic captioning methods for visual content face challenges such as lack of detail, content hallucination, and poor instruction following. In this work, we propose VisualFactChecker (VFC), a flexible training-free pipeline that generates …
External link:
http://arxiv.org/abs/2404.19752
Author:
Zhao, Brian Nlong, Xiao, Yuhang, Xu, Jiashu, Jiang, Xinyang, Yang, Yifan, Li, Dongsheng, Itti, Laurent, Vineet, Vibhav, Ge, Yunhao
The popularization of Text-to-Image (T2I) diffusion models enables the generation of high-quality images from text descriptions. However, generating diverse customized images with reference visual attributes remains challenging. This work focuses on …
External link:
http://arxiv.org/abs/2312.14216
Author:
Ge, Yunhao, Yu, Hong-Xing, Zhao, Cheng, Guo, Yuliang, Huang, Xinyu, Ren, Liu, Itti, Laurent, Wu, Jiajun
A major challenge in monocular 3D object detection is the limited diversity and quantity of objects in real datasets. While augmenting real scenes with virtual objects holds promise to improve both the diversity and quantity of the objects, …
External link:
http://arxiv.org/abs/2312.05277
We create a novel benchmark for evaluating a Deployable Lifelong Learning system for Visual Reinforcement Learning (RL) that is pretrained on a curated dataset, and propose a novel Scalable Lifelong Learning system capable of retaining knowledge from …
External link:
http://arxiv.org/abs/2311.13648
We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion). The proposed approach decouples training data generation into …
External link:
http://arxiv.org/abs/2309.05956
Continual learning aims to emulate the human ability to continually accumulate knowledge over sequential tasks. The main challenge is to maintain performance on previously learned tasks after learning new tasks, i.e., to avoid catastrophic forgetting …
External link:
http://arxiv.org/abs/2307.11386
Author:
Schiappa, Madeline Chantry, Azad, Shehreen, VS, Sachidanand, Ge, Yunhao, Miksik, Ondrej, Rawat, Yogesh S., Vineet, Vibhav
Due to the increase in computational resources and accessibility of data, an increase in large, deep learning models trained on copious amounts of multi-modal data using self-supervised or semi-supervised learning have emerged. These "foundation" models …
External link:
http://arxiv.org/abs/2306.09278