Description: |
In visual saliency detection, accurately segmenting salient objects from images is crucial for applications such as image editing and visual tracking. The task is challenging for two main reasons: 1) salient objects vary widely in size, color, and texture; and 2) the boundaries between salient objects and their backgrounds are often unclear. To address these challenges, this paper introduces GMBA-Net, a deep convolutional network based on a global guidance map and background attention, for precise segmentation of salient objects. First, the Cross-Modal Multi-Scale Attention Module (CMAM) fuses information from RGB and depth images, using attention mechanisms to strengthen the complementarity between the two modalities and the expressiveness of multi-scale features, so that saliency information is captured more effectively at different scales. Next, the Interactive Global Context Module (IGCM) generates a global guidance map that leverages semantic information from high-level features to localize salient regions accurately. Finally, a background attention mechanism (BGAM) refines the boundaries of salient objects, producing sharper edges in the saliency maps. Experiments on multiple public RGB-D salient object detection datasets validate the effectiveness of the proposed method: compared with existing state-of-the-art methods, the model significantly improves both detection accuracy and boundary clarity. |