3D Point Cloud Object Tracking Based on Multi-level Fusion of Transformer Features

Authors: LI Zhijie, LIANG Bowen, DING Xinmiao, GUO Wen
Language: Chinese
Publication year: 2024
Subject:
Source: Jisuanji kexue yu tansuo, Vol 18, Iss 11, Pp 3006-3014 (2024)
Document type: article
ISSN: 1673-9418
DOI: 10.3778/j.issn.1673-9418.2401071
Description: 3D point cloud object tracking is often hampered by occlusion, sparsity, and random noise. To address these challenges, this paper proposes a 3D point cloud object tracking method based on multi-level fusion of Transformer features. The method consists mainly of a point attention embedding module, used for feature extraction, and a point attention enhancement module, used for feature matching. First, two attention mechanisms are embedded into each other to form the point attention embedding module, which is combined with the relationship-aware sampling method proposed by PTTR (point relation transformer for tracking) so that features are extracted fully. The extracted features are then fed into the point attention enhancement module, where cross-attention matches features from different levels in sequence to achieve deep fusion of global and local features. To obtain a discriminative feature fusion map, a residual network connects the fusion results from different layers. Finally, the feature fusion map is fed into the target prediction module, which predicts the final 3D target object. Experiments on the KITTI, nuScenes, and Waymo datasets demonstrate the effectiveness of the proposed method. Excluding few-shot data, the proposed method achieves average improvements of 1.4 percentage points in success and 1.4 percentage points in precision for object tracking.
Database: Directory of Open Access Journals
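
Note: The feature-matching step summarized in the description (cross-attention applied level by level, with residual connections between the fused results) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-head scaled dot-product attention with no learned projections, the same number of search-region points at every level, and plain Python lists as features. The names `cross_attention` and `fuse_levels` are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product cross-attention (assumption: no learned weights).

    queries:     list of N_q feature vectors (e.g. search-region points)
    keys_values: list of N_k feature vectors (e.g. template points), same dimension
    Returns N_q vectors, each a weighted average of the key/value vectors.
    """
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * k[j] for w, k in zip(weights, keys_values))
                    for j in range(dim)])
    return out


def fuse_levels(search_levels, template_levels):
    """Match features level by level and accumulate the results residually.

    Assumption: every level has the same number of search points and the
    same feature dimension, so the fused map can be summed across levels.
    """
    n_points = len(search_levels[0])
    dim = len(search_levels[0][0])
    fused = [[0.0] * dim for _ in range(n_points)]
    for search, template in zip(search_levels, template_levels):
        # Condition the current level's queries on what has been fused so far.
        query = [[s + f for s, f in zip(sv, fv)]
                 for sv, fv in zip(search, fused)]
        matched = cross_attention(query, template)
        # Residual connection: add this level's match to the running fusion.
        fused = [[f + m for f, m in zip(fv, mv)]
                 for fv, mv in zip(fused, matched)]
    return fused


if __name__ == "__main__":
    search_levels = [
        [[0.0, 1.0], [1.0, 0.0]],   # shallow-level search features
        [[0.5, 0.5], [0.2, 0.1]],   # deeper-level search features
    ]
    template_levels = [
        [[1.0, 0.0], [0.0, 1.0]],   # shallow-level template features
        [[0.3, 0.7], [0.9, 0.1]],   # deeper-level template features
    ]
    for vec in fuse_levels(search_levels, template_levels):
        print(vec)
```

In a real tracker the queries, keys, and values would pass through learned linear projections and multiple heads; the sketch keeps only the matching and residual accumulation so the data flow is easy to follow.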