Výsledky vyhledávání - "Lin, Tsung yi"

Report

Meshtron: High-Fidelity, Artist-Like 3D Mesh Generation at Scale

Autor: Hao, Zekun, Romero, David W., Lin, Tsung-Yi, Liu, Ming-Yu

Meshes are fundamental representations of 3D surfaces. However, creating high-quality meshes is a labor-intensive task that requires significant time and expertise in 3D modeling. While a delicate object often requires over $10^4$ faces to be accurat

Externí odkaz: http://arxiv.org/abs/2412.09548

Zobrazit plný text záznamu

Report

Edify 3D: Scalable High-Quality 3D Asset Generation

We introduce Edify 3D, an advanced solution designed for high-quality 3D asset generation. Our method first synthesizes RGB and surface normal images of the described object at multiple viewpoints using a diffusion model. The multi-view observations

Externí odkaz: http://arxiv.org/abs/2411.07135

Zobrazit plný text záznamu

Report

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Autor: Ge, Yunhao, Zeng, Xiaohui, Huffman, Jacob Samuel, Lin, Tsung-Yi, Liu, Ming-Yu, Cui, Yin

Existing automatic captioning methods for visual content face challenges such as lack of detail, content hallucination, and poor instruction following. In this work, we propose VisualFactChecker (VFC), a flexible training-free pipeline that generates

Externí odkaz: http://arxiv.org/abs/2404.19752

Zobrazit plný text záznamu

Report

ATT3D: Amortized Text-to-3D Object Synthesis

Autor: Lorraine, Jonathan, Xie, Kevin, Zeng, Xiaohui, Lin, Chen-Hsuan, Takikawa, Towaki, Sharp, Nicholas, Lin, Tsung-Yi, Liu, Ming-Yu, Fidler, Sanja, Lucas, James

Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimization to c

Externí odkaz: http://arxiv.org/abs/2306.07349

Zobrazit plný text záznamu

Report

Motion-Conditioned Diffusion Model for Controllable Video Synthesis

Autor: Chen, Tsai-Shien, Lin, Chieh Hubert, Tseng, Hung-Yu, Lin, Tsung-Yi, Yang, Ming-Hsuan

Recent advancements in diffusion models have greatly improved the quality and diversity of synthesized content. To harness the expressive power of diffusion models, researchers have explored various controllable mechanisms that allow users to intuiti

Externí odkaz: http://arxiv.org/abs/2304.14404

Zobrazit plný text záznamu

Report

Magic3D: High-Resolution Text-to-3D Content Creation

Autor: Lin, Chen-Hsuan, Gao, Jun, Tang, Luming, Takikawa, Towaki, Zeng, Xiaohui, Huang, Xun, Kreis, Karsten, Fidler, Sanja, Liu, Ming-Yu, Lin, Tsung-Yi

DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results. However, the method has two inherent limitations: (a) extre

Externí odkaz: http://arxiv.org/abs/2211.10440

Zobrazit plný text záznamu

Report

Optimizing Anchor-based Detectors for Autonomous Driving Scenes

Autor: Du, Xianzhi, Hung, Wei-Chih, Lin, Tsung-Yi

This paper summarizes model improvements and inference-time optimizations for the popular anchor-based detectors in the scenes of autonomous driving. Based on the high-performing RCNN-RS and RetinaNet-RS detection frameworks designed for common detec

Externí odkaz: http://arxiv.org/abs/2208.06062

Zobrazit plný text záznamu

Report

Vision Transformer for NeRF-Based View Synthesis from a Single Input Image

Autor: Lin, Kai-En, Yen-Chen, Lin, Lai, Wei-Sheng, Lin, Tsung-Yi, Shih, Yi-Chang, Ramamoorthi, Ravi

Although neural radiance fields (NeRF) have shown impressive advances for novel view synthesis, most methods typically require multiple input images of the same scene with accurate camera poses. In this work, we seek to substantially reduce the input

Externí odkaz: http://arxiv.org/abs/2207.05736

Zobrazit plný text záznamu

Report

A Unified Sequence Interface for Vision Tasks

Autor: Chen, Ting, Saxena, Saurabh, Li, Lala, Lin, Tsung-Yi, Fleet, David J., Hinton, Geoffrey

While language tasks are naturally expressed in a single, unified, modeling framework, i.e., generating sequences of tokens, this has not been the case in computer vision. As a result, there is a proliferation of distinct architectures and loss funct

Externí odkaz: http://arxiv.org/abs/2206.07669

Zobrazit plný text záznamu

Report

NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields

Autor: Yen-Chen, Lin, Florence, Pete, Barron, Jonathan T., Lin, Tsung-Yi, Rodriguez, Alberto, Isola, Phillip

Thin, reflective objects such as forks and whisks are common in our daily lives, but they are particularly challenging for robot perception because it is hard to reconstruct them using commodity RGB-D cameras or multi-view stereo techniques. While tr

Externí odkaz: http://arxiv.org/abs/2203.01913

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání