Showing 1 - 1 of 1 for search: '"Ha, Cuong Nhat"'
Authors:
Ha, Cuong Nhat, Asaadi, Shima, Karn, Sanjeev Kumar, Farri, Oladimeji, Heimann, Tobias, Runkler, Thomas
Vision-language models, while effective in general domains and showing strong performance in diverse multi-modal applications like visual question-answering (VQA), struggle to maintain the same level of effectiveness in more specialized domains, e.g.
External link:
http://arxiv.org/abs/2404.16192