Abstrakt: |
To address the challenge of rapid geometric model development in the digital twin industry, this paper presents a comprehensive pipeline for constructing 3D models from images using monocular vision imaging principles. Firstly, a structure-from-motion (SFM) algorithm generates a 3D point cloud from photographs. The feature detection methods scale-invariant feature transform (SIFT), speeded-up robust features (SURF), and KAZE are compared across six datasets, with SIFT proving the most effective (matching rate higher than 0.12). Using K-nearest-neighbor matching and random sample consensus (RANSAC), refined feature point matching and 3D spatial representation are achieved via antipodal geometry. Then, the Poisson surface reconstruction algorithm converts the point cloud into a mesh model. Additionally, texture images are enhanced by leveraging a visual geometry group (VGG) network-based deep learning approach. Content images from a dataset provide geometric contours via higher-level VGG layers, while textures from style images are extracted using the lower-level layers. These are fused to create texture-transferred images, where the image quality assessment (IQA) metrics SSIM and PSNR are used to evaluate texture-enhanced images. Finally, texture mapping integrates the enhanced textures with the mesh model, improving the scene representation with enhanced texture. The method presented in this paper surpassed a LiDAR-based reconstruction approach by 20 % in terms of point cloud density and number of model facets, while the hardware cost was only 1 % of that associated with LiDAR. [ABSTRACT FROM AUTHOR] |