Showing 1 - 10 of 720 for search: '"Lin, Tsung yi"'
Meshes are fundamental representations of 3D surfaces. However, creating high-quality meshes is a labor-intensive task that requires significant time and expertise in 3D modeling. While a delicate object often requires over $10^4$ faces to be accurat
External link:
http://arxiv.org/abs/2412.09548
Authors:
NVIDIA; Bala, Maciej; Cui, Yin; Ding, Yifan; Ge, Yunhao; Hao, Zekun; Hasselgren, Jon; Huffman, Jacob; Jin, Jingyi; Lewis, J. P.; Li, Zhaoshuo; Lin, Chen-Hsuan; Lin, Yen-Chen; Lin, Tsung-Yi; Liu, Ming-Yu; Luo, Alice; Ma, Qianli; Munkberg, Jacob; Shi, Stella; Wei, Fangyin; Xiang, Donglai; Xu, Jiashu; Zeng, Xiaohui; Zhang, Qinsheng
We introduce Edify 3D, an advanced solution designed for high-quality 3D asset generation. Our method first synthesizes RGB and surface normal images of the described object at multiple viewpoints using a diffusion model. The multi-view observations
External link:
http://arxiv.org/abs/2411.07135
Existing automatic captioning methods for visual content face challenges such as lack of detail, content hallucination, and poor instruction following. In this work, we propose VisualFactChecker (VFC), a flexible training-free pipeline that generates
External link:
http://arxiv.org/abs/2404.19752
Authors:
Lorraine, Jonathan; Xie, Kevin; Zeng, Xiaohui; Lin, Chen-Hsuan; Takikawa, Towaki; Sharp, Nicholas; Lin, Tsung-Yi; Liu, Ming-Yu; Fidler, Sanja; Lucas, James
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimization to c
External link:
http://arxiv.org/abs/2306.07349
Recent advancements in diffusion models have greatly improved the quality and diversity of synthesized content. To harness the expressive power of diffusion models, researchers have explored various controllable mechanisms that allow users to intuiti
External link:
http://arxiv.org/abs/2304.14404
Authors:
Lin, Chen-Hsuan; Gao, Jun; Tang, Luming; Takikawa, Towaki; Zeng, Xiaohui; Huang, Xun; Kreis, Karsten; Fidler, Sanja; Liu, Ming-Yu; Lin, Tsung-Yi
DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results. However, the method has two inherent limitations: (a) extre
External link:
http://arxiv.org/abs/2211.10440
This paper summarizes model improvements and inference-time optimizations for the popular anchor-based detectors in the scenes of autonomous driving. Based on the high-performing RCNN-RS and RetinaNet-RS detection frameworks designed for common detec
External link:
http://arxiv.org/abs/2208.06062
Although neural radiance fields (NeRF) have shown impressive advances for novel view synthesis, most methods typically require multiple input images of the same scene with accurate camera poses. In this work, we seek to substantially reduce the input
External link:
http://arxiv.org/abs/2207.05736
While language tasks are naturally expressed in a single, unified, modeling framework, i.e., generating sequences of tokens, this has not been the case in computer vision. As a result, there is a proliferation of distinct architectures and loss funct
External link:
http://arxiv.org/abs/2206.07669
Authors:
Yen-Chen, Lin; Florence, Pete; Barron, Jonathan T.; Lin, Tsung-Yi; Rodriguez, Alberto; Isola, Phillip
Thin, reflective objects such as forks and whisks are common in our daily lives, but they are particularly challenging for robot perception because it is hard to reconstruct them using commodity RGB-D cameras or multi-view stereo techniques. While tr
External link:
http://arxiv.org/abs/2203.01913