CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models

Author: Bharti, Shubham; Cheng, Shiyun; Rho, Jihyun; Rao, Martina; Zhu, Xiaojin
Year of publication: 2024
Subject:
Document type: Working Paper
Description: We introduce CHARTOM, a visual theory-of-mind benchmark for multimodal large language models. CHARTOM consists of specially designed data-visualizing charts. Given a chart, a language model needs not only to comprehend the chart correctly (the FACT question) but also to judge whether the chart will be misleading to a human reader (the MIND question). Both questions have significant societal benefits. We detail the construction of the CHARTOM benchmark, including its calibration on human performance.
Database: arXiv