Showing 1 - 3 of 3 for search: '"Zahraei, Pardis Sadat"'
Author:
Zahraei, Pardis Sadat, Shakeri, Zahra
Biased AI-generated medical advice and misdiagnoses can jeopardize patient safety, making the integrity of AI in healthcare more critical than ever. As Large Language Models (LLMs) take on a growing role in medical decision-making, addressing their b
External link:
http://arxiv.org/abs/2410.06566
We present TuringQ, the first benchmark designed to evaluate the reasoning capabilities of large language models (LLMs) in the theory of computation. TuringQ consists of 4,006 undergraduate and graduate-level question-answer pairs, categorized into f
External link:
http://arxiv.org/abs/2410.06547
Author:
Zahraei, Pardis Sadat, Emami, Ali
The Winograd Schema Challenge (WSC) serves as a prominent benchmark for evaluating machine understanding. While Large Language Models (LLMs) excel at answering WSC questions, their ability to generate such questions remains less explored. In this wor
External link:
http://arxiv.org/abs/2401.17703