Head pose estimation with particle swarm optimization‐based contrastive learning and multimodal entangled GCN

Author: Yuanfeng Lian, Yinliang Shi, Zhaonian Liu, Bin Jiang, Xingtao Li
Language: English
Year of publication: 2024
Subject:
Source: IET Image Processing, Vol 18, Iss 11, Pp 2899-2917 (2024)
Document type: article
ISSN: 1751-9667
1751-9659
DOI: 10.1049/ipr2.13142
Description: Head pose estimation is especially challenging because of the complex nonlinear mapping from the 2D feature space to the 3D pose space. To address this issue, this paper presents a novel and efficient head pose estimation framework based on particle swarm optimized contrastive learning and a multimodal entangled graph convolution network. First, a new network, the region and difference‐aware feature pyramid network (RD‐FPN), is proposed for 2D keypoint detection to alleviate background interference and enhance feature expressiveness. Then, particle swarm optimized contrastive learning is constructed to alternately match 2D and 3D keypoints; it takes the multimodal keypoint matching accuracy as the optimization objective and uses the similarity of cross‐modal positive and negative sample pairs from contrastive learning as a local contrastive constraint. Finally, a multimodal entangled graph convolution network based on second‐order bilinear attention is designed to better establish geometric relationships between keypoints and head pose angles; within it, point‐edge attention is introduced to improve the representation of geometric features between multimodal keypoints. Compared with other methods, the average error of our method is reduced by 8.23%, indicating its accuracy, generalization, and efficiency on the 300W‐LP, AFLW2000, and BIWI datasets.
Database: Directory of Open Access Journals