Showing 1 - 1 of 1 for search: '"Islam, Sami Nur"'
Pre-trained Vision-Language Models (VLMs) are able to understand visual concepts, describe and decompose complex tasks into sub-tasks, and provide feedback on task completion. In this paper, we aim to leverage these capabilities to support the traini
External link:
http://arxiv.org/abs/2402.04764