Showing 1 - 1 of 1 for search: '"Hang, Jiangnan"'
The natural language understanding (NLU) performance of large language models (LLMs) has been evaluated across various tasks and datasets. The existing evaluation methods, however, do not take into account the variance in scores due to differences in
External link:
http://arxiv.org/abs/2408.12263