RGB-D Semantic Segmentation and Label-Oriented Voxelgrid Fusion for Accurate 3D Semantic Mapping

Authors: Jingwei Xu, Xianshun Wang, Dongchen Zhu, Wenjun Shi, Guanghui Zhang, Xiaolin Zhang, Jiamao Li
Year: 2022
Source: IEEE Transactions on Circuits and Systems for Video Technology. 32:183-197
ISSN: 1051-8215 (print), 1558-2205 (electronic)
DOI: 10.1109/tcsvt.2021.3056726
Abstract: The 3D semantic map plays an increasingly important role in a wide variety of applications, especially for task-driven robots. In this paper, we present a semantic mapping methodology for obtaining a 3D semantic map from RGB-D scans. In contrast to existing methods that use 3D annotated information as supervision, we focus on accurate 2D frame labeling and combine the labels in 3D space using a semantic fusion mechanism. For scene parsing, a two-stream network with a novel discriminatory mask loss is proposed to sufficiently extract and fuse RGB and depth information, achieving stable semantic segmentation. The discriminatory mask guides the cross-entropy loss function and modulates the influence of individual pixels on back-propagation, which reduces the harmful effects of depth noise and fallible annotations at object edges. Once the correspondences between frames are provided, the semantic frames are fused in a unified 3D coordinate system using the novel label-oriented voxelgrid filter, which ensures intra-frame spatial continuity and inter-frame spatiotemporal consistency by applying a label-oriented statistical principle to the labeled point clouds. To avoid unfavorable interference between uncorrelated frames, we further propose an adaptive grouping algorithm that applies a view frustum filter to group frames with sufficient overlap into a segment. Finally, we demonstrate the effectiveness of the proposed method on the 2D/3D semantic label benchmarks of the ScanNetv2 and Cityscapes datasets.
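The abstract describes three mechanisms in enough detail to sketch them. The sketches below are illustrative Python, not the authors' code; every function name, parameter, and default value is an assumption introduced here for clarity.

First, a minimal sketch of a discriminatory-mask-weighted cross-entropy loss, assuming the mask is a per-pixel weight in [0, 1] that down-weights unreliable pixels (e.g., object edges affected by depth noise or fallible annotations):

import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, target, mask):
    """logits: (B, C, H, W); target: (B, H, W) class indices;
    mask: (B, H, W) per-pixel weights in [0, 1]."""
    # Per-pixel loss with no reduction, so the mask can reweight it.
    per_pixel = F.cross_entropy(logits, target, reduction="none")
    # The mask modulates each pixel's contribution to back-propagation.
    return (mask * per_pixel).sum() / mask.sum().clamp(min=1.0)

Second, a sketch of the label-oriented voxelgrid filter, assuming the "label-oriented statistical principle" amounts to a per-voxel majority vote over the labels of all points that fall in each voxel; the voxel size and the dictionary-based grid are assumptions:

import numpy as np
from collections import defaultdict, Counter

def label_oriented_voxel_fuse(points, labels, voxel_size=0.05):
    """points: (N, 3) world coordinates; labels: (N,) class ids.
    Returns fused voxel centers and their majority-vote labels."""
    grid = defaultdict(list)
    keys = np.floor(points / voxel_size).astype(np.int64)
    for key, label in zip(map(tuple, keys), labels):
        grid[key].append(int(label))
    centers, fused = [], []
    for key, votes in grid.items():
        # The most frequent label in the voxel wins.
        fused.append(Counter(votes).most_common(1)[0][0])
        centers.append((np.asarray(key) + 0.5) * voxel_size)
    return np.stack(centers), np.asarray(fused)

Third, a sketch of the adaptive grouping step, assuming frames are appended to the current segment while their view-frustum overlap with the segment's first frame stays above a threshold; overlap_fn and the threshold value are hypothetical placeholders for the paper's view frustum filter:

def adaptive_grouping(frames, overlap_fn, threshold=0.3):
    """Group consecutive frames into segments of sufficient overlap."""
    segments, current = [], [frames[0]]
    for frame in frames[1:]:
        if overlap_fn(current[0], frame) >= threshold:
            current.append(frame)
        else:
            segments.append(current)  # close the segment, start a new one
            current = [frame]
    segments.append(current)
    return segments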
Database: OpenAIRE