Showing 1 - 1 of 1 for search: '"Bezrukov, Oleksandr"'
Evaluating Video Language Models (VLMs) is a challenging task. Due to its transparency, Multiple-Choice Question Answering (MCQA) is widely used to measure the performance of these models through accuracy. However, existing MCQA benchmarks fail to ca
External link:
http://arxiv.org/abs/2410.14248