Showing 1 - 1 of 1 for search: '"Khani, Aliasgahr"'
Existing benchmarks for large language models (LLMs) increasingly struggle to differentiate between top-performing models, underscoring the need for more challenging evaluation frameworks. We introduce MMLU-Pro+, an enhanced benchmark building upon MMLU-Pro …
External link:
http://arxiv.org/abs/2409.02257