Popis: |
Abstract: Optimizing the structure of deep neural networks is essential in many applications, especially in object detection tasks for Unmanned Aerial Vehicles. Due to the constraints of the onboard platform, a more efficient network is required to meet practical demands. Nevertheless, existing lightweight detection networks exhibit excessive redundant computation and may incur a certain loss of accuracy. To address these issues, this paper proposes a new lightweight network structure named Cross-Stage Partially Deformable Network (CSPDNet). First, a Deformable Separable Convolution Block (DSCBlock) is proposed, which separates feature channels, greatly reducing the computational load of convolution, and applies adaptive sampling to the separated feature maps. Subsequently, to establish information interaction between feature layers, a channel weighting module is proposed. This module calculates weights for the separated feature maps, facilitating information exchange across channels and resolutions. Moreover, it compensates for the effect of point-wise (1 × 1) convolutions, filtering out more important feature information. Furthermore, a new CSPDBlock is designed, primarily composed of DSCBlocks, which establishes multidimensional feature correlations for each separated feature layer. This approach improves the ability to capture critical feature information and reconstructs gradient paths, thereby preserving detection accuracy. The proposed method achieves a balance between model parameter size and detection accuracy. Experimental results on object detection datasets demonstrate that the designed network, using fewer parameters, achieves detection performance competitive with the existing lightweight networks YOLOv5n, YOLOv6n, YOLOv8n, NanoDet, and PP-PicoDet. 
The optimization effect of the designed CSPDBlock is validated on the VisDrone dataset by incorporating it into the advanced detection algorithms YOLOv5m, PPYOLOEm, YOLOv7, RTMDetm, and YOLOv8m. Specifically, incorporating the designed modules reduces the number of parameters by 10–20% while almost maintaining detection accuracy. |