Universal Relocalizer for Weakly Supervised Referring Expression Grounding.

Author: Zhang, Panpan; Liu, Meng; Song, Xuemeng; Cao, Da; Gao, Zan; Nie, Liqiang
Subject:
Source: ACM Transactions on Multimedia Computing, Communications & Applications; Jul 2024, Vol. 20 Issue 7, p1-23, 23p
Abstract: This article introduces the Universal Relocalizer, a novel approach designed for weakly supervised referring expression grounding. Our method strives to pinpoint a target proposal that corresponds to a specific query, eliminating the need for region-level annotations during training. To bolster the localization precision and enrich the semantic understanding of the target proposal, we devise three key modules: the category module, the color module, and the spatial relationship module. The category and color modules assign respective category and color labels to region proposals, enabling the computation of category and color scores. Simultaneously, the spatial relationship module integrates spatial cues, yielding a spatial score for each proposal to enhance localization accuracy further. By adeptly amalgamating the category, color, and spatial scores, we derive a refined grounding score for every proposal. Comprehensive evaluations on the RefCOCO, RefCOCO+, and RefCOCOg datasets manifest the prowess of the Universal Relocalizer, showcasing its formidable performance across the board. [ABSTRACT FROM AUTHOR]
Database: Complementary Index