Exploration of visual prompt in Grounded pre-trained open-set detection

Authors:	Chen, Qibo; Jin, Weizhong; Li, Shuchang; Liu, Mengdi; Yu, Li; Jiang, Jian; Wang, Xiaozheng
Year of publication:	2023
Subject:	Computer Science - Computer Vision and Pattern Recognition
Document type:	Working Paper
Description:	Text prompts are crucial for generalizing pre-trained open-set object detection models to new categories. However, current methods for text prompts are limited as they require manual feedback when generalizing to new categories, which restricts their ability to model complex scenes, often leading to incorrect detection results. To address this limitation, we propose a novel visual prompt method that learns new category knowledge from a few labeled images, which generalizes the pre-trained detection model to the new category. To allow visual prompts to represent new categories adequately, we propose a statistical-based prompt construction module that is not limited by predefined vocabulary lengths, thus allowing more vectors to be used when representing categories. We further utilize the category dictionaries in the pre-training dataset to design task-specific similarity dictionaries, which make visual prompts more discriminative. We evaluate the method on the ODinW dataset and show that it outperforms existing prompt learning methods and performs more consistently in combinatorial inference. Comment: Accepted at ICASSP 2024
Database:	arXiv
External link:	http://arxiv.org/abs/2312.08839