Showing 1 - 1 of 1 for search: '"Khullar, Dipika"'
Through a simple multiple-choice language prompt, a visual question answering (VQA) model can operate as a zero-shot image classifier, producing a classification label. Compared to typical image encoders, VQA models offer an advantage: VQA-produced image embeddings can be infuse
External link:
http://arxiv.org/abs/2407.16145
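
The abstract above describes turning candidate class labels into a multiple-choice prompt and reading the model's answer as the classification label. The following is a minimal sketch of that idea only. The question wording and the helper names `build_prompt` and `parse_answer` are hypothetical and not taken from the paper, and no actual VQA model is called here.

```python
from string import ascii_uppercase


def build_prompt(labels):
    """Format candidate labels as a lettered multiple-choice question.

    The question wording is an assumption made for illustration.
    """
    if not 0 < len(labels) <= len(ascii_uppercase):
        raise ValueError("expected between 1 and 26 labels")
    options = [f"{letter}. {label}" for letter, label in zip(ascii_uppercase, labels)]
    return "\n".join(
        ["What is shown in the image?", *options, "Answer with a single letter."]
    )


def parse_answer(answer, labels):
    """Map a model reply such as 'B' or ' b.' back to a label, or None."""
    text = answer.strip().upper()
    if not text:
        return None
    # find() returns -1 for anything that is not an uppercase ASCII letter
    index = ascii_uppercase.find(text[0])
    return labels[index] if 0 <= index < len(labels) else None
```

In use, the string from `build_prompt(["cat", "dog", "car"])` would be passed to a VQA model together with the image, and the model's text reply would go through `parse_answer` to recover the predicted label.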