Popis: |
To realize real-time road scene understanding, jointly detecting objects and segmenting road areas, an improved multi-task network was proposed. Based on SE-ResNeXt, dilated convolution was added to expand the image receptive field and improve the performance of the encoder. In terms of object detection, a coarse-fine optimization network was proposed, using high-level features to further refine the low-level rough estimated results, and the self-attention module was used to adaptively adjust the detection results of different scales from a global perspective view. For road detection, a pyramid pooling model was added to obtain global information, and a jump connection mode was used to combine multi-level features. A channel adjustment module was added to adjust the relationship between different channels. Experiments show that these strategies can significantly improve the detection results while increasing a small amount of reasoning time. And generalization experiment proves the effectiveness of the method. Two tasks were trained together, resulting in mutual promotion. |