Towards real-time accurate dense pedestrian detection via large-kernel perception module and multi-level feature fusion.

Authors: Li, Huajie; Zhang, Sulan; Hu, Lihua; Zhou, Huiyuan
Source: Journal of Real-Time Image Processing; Jan 2025, Vol. 22, Issue 1, p1-14, 14p
Abstract: In the domain of computer vision, dense pedestrian detection remains a challenging task. In dense scenes, existing one-stage detectors often struggle with feature extraction and localize small pedestrian targets inaccurately because of severe occlusion and significant scale variation. To resolve these issues, we propose a new dense pedestrian detection algorithm named LEI-YOLO. First, we embed the proposed Large Kernel Perception (LKP) module into the backbone network to capture global information of targets in occluded scenes, extracting more comprehensive feature representations. Second, an enhanced Path Aggregation Network (E-PANet) is designed in the neck network to perform multi-level feature fusion by capturing more shallow target feature information and fully integrating it with deep feature information, effectively reducing missed detections of small pedestrians. Finally, we propose a dynamically focused Powerful Intersection over Union (PIoU) loss function combined with auxiliary bounding boxes (Inner-PIoUv2) to improve the original loss function, leading to faster convergence and more accurate regression results. Experimental results show that on the CrowdHuman dataset, the proposed method improves the Average Precision (AP) by 4.2% compared to the baseline model and reduces the log-average Miss Rate (MR⁻²) by 4.8%. Moreover, our method also achieves significant results on the WiderPerson dataset, further validating the model’s effectiveness and generalization capability and achieving an effective balance between accuracy and real-time performance in dense pedestrian detection scenarios. [ABSTRACT FROM AUTHOR]
Database: Complementary Index