De-fine: Decomposing and Refining Visual Programs with Auto-Feedback

Author:	Gao, Minghe, Li, Juncheng, Fei, Hao, Pang, Liang, Ji, Wei, Wang, Guoming, Lv, Zheqi, Zhang, Wenqiao, Tang, Siliang, Zhuang, Yueting
Publication year:	2023
Subject:	Computer Science - Computer Vision and Pattern Recognition
Document type:	Working Paper
Description:	Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks. Unlike end-to-end models that need task-specific data, it performs visual processing and reasoning in an unsupervised manner. Current visual programming methods generate programs in a single pass for each task and lack the ability to evaluate and optimize based on feedback, which limits their effectiveness for complex, multi-step problems. Drawing inspiration from Benders decomposition, we introduce De-fine, a training-free framework that automatically decomposes complex tasks into simpler subtasks and refines programs through auto-feedback. This model-agnostic approach can improve logical reasoning performance by integrating the strengths of multiple models. Our experiments across various visual tasks show that De-fine creates more robust programs. Moreover, viewing each feedback module as an independent agent will yield fresh prospects for the field of agent research.
Database:	arXiv
External link:	http://arxiv.org/abs/2311.12890