T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image

Authors:	Zhang, Shijie; Jiang, Boyan; He, Keke; Zhu, Junwei; Tai, Ying; Wang, Chengjie; Zhang, Yinda; Fu, Yanwei
Year of publication:	2024
Subject:	Computer Science - Computer Vision and Pattern Recognition
Document type:	Working Paper
Description:	Pixel2Mesh (P2M) is a classical approach for reconstructing 3D shapes from a single color image through coarse-to-fine mesh deformation. Although P2M is capable of generating plausible global shapes, its Graph Convolution Network (GCN) often produces overly smooth results, causing the loss of fine-grained geometric details. Moreover, P2M generates non-credible features for occluded regions and struggles with the domain gap from synthetic data to real-world images, which is a common challenge for single-view 3D reconstruction methods. To address these challenges, we propose a novel Transformer-boosted architecture, named T-Pixel2Mesh, inspired by the coarse-to-fine approach of P2M. Specifically, we use a global Transformer to control the holistic shape and a local Transformer to progressively refine the local geometric details with graph-based point upsampling. To enhance real-world reconstruction, we present the simple yet effective Linear Scale Search (LSS), which serves as prompt tuning during input preprocessing. Our experiments on ShapeNet demonstrate state-of-the-art performance, while results on real-world data show the method's generalization capability.
Comment:	Received by ICASSP 2024
Database:	arXiv
External link:	http://arxiv.org/abs/2403.13663