Position information encoding FPN for small object detection in aerial images.

Authors: Feng, Dapeng; Zhuang, Xuebin; Chen, Zhiqiang; Zhong, Shipeng; Qi, Yuhua; Chen, Hongbo; Ma, Hongjun
Subject:
Source: Neural Computing & Applications; Sep 2024, Vol. 36, Issue 26, pp. 16023-16035, 13 pp.
Abstract: Small object detection in aerial images is a challenge in remote sensing. Recently, convolutional neural networks (CNNs) have succeeded by learning localized filters that embed relative spatial information, but they fail to detect small objects in aerial images because of uneven padding. Even though the padding mechanism in CNNs allows for the capture of absolute position information and ensures consistent input–output resolution, it leads to a diminishing extraction of absolute position context from the edges to the center. This results in an asymmetry bias, which negatively impacts position-dependent visual tasks like small object detection, causing blind spots and misdetection. In this study, we show that complex-valued CNNs, capable of explicitly encoding absolute position information, can significantly enhance conventional object detection techniques. To accomplish this, we introduce the position information encoding feature pyramid network (PieFPN), which consists of a complex-valued encoder and a real-valued decoder for explicit position information encoding. Additionally, we present the general Gaussian normalization and Gaussian error linear unit for multiple variables, incorporating them into end-to-end training schemes. To utilize ImageNet pre-trained weights, we merge PieFPN with traditional feature pyramid networks, allowing for seamless integration into existing object detection pipelines. Our complex-valued designs outperform their real-valued counterparts, achieving state-of-the-art results on the DOTA-v2.0 dataset for oriented object detection in aerial images. [ABSTRACT FROM AUTHOR]
Database: Complementary Index