Popis: |
Panoptic segmentation, which combines instance and semantic segmentation, provides comprehensive image understanding for a wide range of tasks, but achieving real-time performance with high accuracy remains challenging. Recent real-time panoptic segmentation models often fall short in accuracy compared with state-of-the-art benchmarks. In this paper, we aim to enhance the performance of "You Only Segment Once" (YOSO), the fastest panoptic segmentation model. Our model, C-YOSO, extends YOSO with a contrastive query decoder module built on two core components. First, a textual-guided query applies a contrastive loss between object queries and textual ground truth to improve accuracy. Second, a lightweight query decoder accelerates inference by relying on global average pooling (GAP) and $1\times 1$ convolutions. Experiments are conducted on the Cityscapes dataset, comparing C-YOSO (ours) with YOSO. The results show that accuracy improves from 59.7 to 61.8 panoptic quality (PQ) while inference speed remains nearly unchanged (11.1 vs. 11.0 frames per second, FPS), and accuracy increases for almost all classes. To reach real-time operation, we halve the input size, achieving 22.3 FPS at 54.1 PQ. These results demonstrate that our model delivers the best combination of accuracy (PQ) and speed (FPS).
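
The sketch below illustrates, under stated assumptions, the two components named in the abstract: a lightweight query decoder built only from global average pooling and $1\times 1$ convolutions, and an InfoNCE-style contrastive loss between object-query embeddings and text embeddings of the ground-truth class names. The class names (LightweightQueryDecoder, contrastive_query_loss), dimensions, and temperature are illustrative assumptions and not the authors' C-YOSO implementation.

```python
# Minimal PyTorch sketch; module/function names, shapes, and the temperature
# value are assumptions for illustration, not the C-YOSO code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightweightQueryDecoder(nn.Module):
    """Produces object queries from a feature map using only global average
    pooling (GAP) and 1x1 convolutions, keeping the decoder cheap."""

    def __init__(self, in_channels: int, embed_dim: int, num_queries: int):
        super().__init__()
        # 1x1 convolutions: channel mixing with no spatial cost.
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=1)
        self.to_queries = nn.Conv2d(embed_dim, num_queries * embed_dim, kernel_size=1)
        self.num_queries = num_queries
        self.embed_dim = embed_dim

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) features from the backbone / pixel decoder.
        x = F.relu(self.proj(feats))
        x = F.adaptive_avg_pool2d(x, 1)            # GAP -> (B, embed_dim, 1, 1)
        q = self.to_queries(x)                     # (B, num_queries*embed_dim, 1, 1)
        return q.view(-1, self.num_queries, self.embed_dim)


def contrastive_query_loss(query_emb: torch.Tensor,
                           text_emb: torch.Tensor,
                           temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss pulling each matched object query toward the text
    embedding of its ground-truth class and away from the other classes.
    query_emb: (N, D) matched queries; text_emb: (N, D) class-name embeddings
    of their ground truth (e.g. from a frozen text encoder)."""
    q = F.normalize(query_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = q @ t.T / temperature                 # (N, N) similarity matrix
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)


# Usage with dummy shapes:
decoder = LightweightQueryDecoder(in_channels=256, embed_dim=128, num_queries=100)
queries = decoder(torch.randn(2, 256, 64, 128))    # (2, 100, 128)
loss = contrastive_query_loss(torch.randn(8, 128), torch.randn(8, 128))
```

Because the decoder avoids attention and large kernels, its cost is dominated by channel mixing, which is consistent with the abstract's claim that the module adds accuracy without reducing the frame rate.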