ChatGPT for generating multiple-choice questions: Evidence on the use of artificial intelligence in automatic item generation for a rational pharmacotherapy exam.

Author:	Kıyak, Yavuz Selim; Coşkun, Özlem; Budakoğlu, Işıl İrem; Uluoğlu, Canan
Subject:	MEDICAL education; ARTIFICIAL intelligence; DRUG therapy; HYPERTENSION; RESEARCH methodology evaluation; PILOT projects; PROFESSIONAL licensure examinations; EDUCATIONAL tests & measurements; CERTIFICATION; MEDICAL students; PSYCHOMETRICS; CLINICAL competence; MEDICAL schools; DISCRIMINATION (Sociology); AUTOMATION; COMPUTER assisted testing (Education); WRITTEN communication
Source:	European Journal of Clinical Pharmacology; May 2024, Vol. 80 Issue 5, p729-735, 7p
Abstract:	Purpose: Artificial intelligence, specifically large language models such as ChatGPT, offers potential benefits in question (item) writing. This study aimed to determine the feasibility of generating case-based multiple-choice questions using ChatGPT in terms of item difficulty and discrimination levels. Methods: This study involved 99 fourth-year medical students who participated in a rational pharmacotherapy clerkship based on the WHO 6-Step Model. In response to a prompt that we provided, ChatGPT generated ten case-based multiple-choice questions on hypertension. Following review by an expert panel, two of these multiple-choice questions were incorporated into a medical school exam without any changes. Based on the administration of the test, we evaluated their psychometric properties, including item difficulty, item discrimination (point-biserial correlation), and functionality of the options. Results: Both questions exhibited acceptable point-biserial correlations (0.41 and 0.39), above the threshold of 0.30. However, one question had three non-functional options (options chosen by fewer than 5% of the exam participants), while the other question had none. Conclusions: The findings showed that the questions can effectively differentiate between students who perform at high and low levels, which also points to the potential of ChatGPT as an artificial intelligence tool in test development. Future studies may use the prompt to generate items and gather data from diverse institutions and settings, thereby enhancing the external validity of the results. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index
External link:	View full text of the record; Full text from SpringerLink
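
Note on the psychometric measures named in the abstract: item difficulty, point-biserial discrimination, and non-functional options can each be computed from raw response data. The following is a minimal Python sketch under stated assumptions, not the authors' analysis code. The function names and data layout are hypothetical, and the total score is assumed to include the item itself (the abstract does not say whether a corrected total was used). The 5% cutoff for non-functional options and the 0.30 discrimination threshold are taken from the abstract.

```python
from collections import Counter
from statistics import mean, pstdev

# Acceptability threshold for the point-biserial correlation, per the abstract.
DISCRIMINATION_THRESHOLD = 0.30


def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item score and the total score.

    Assumption: total_scores includes the item itself (uncorrected total).
    """
    p = mean(item_scores)
    sd = pstdev(total_scores)  # population standard deviation of totals
    if p in (0, 1) or sd == 0:
        raise ValueError("correlation is undefined without variation")
    correct = [t for s, t in zip(item_scores, total_scores) if s == 1]
    incorrect = [t for s, t in zip(item_scores, total_scores) if s == 0]
    return (mean(correct) - mean(incorrect)) / sd * (p * (1 - p)) ** 0.5


def non_functional_options(responses, options, key, threshold=0.05):
    """Distractors chosen by fewer than `threshold` of examinees (5% in the abstract)."""
    counts = Counter(responses)
    n = len(responses)
    return [o for o in options if o != key and counts[o] / n < threshold]
```

An item would count as acceptably discriminating in this sketch when `point_biserial(...)` exceeds `DISCRIMINATION_THRESHOLD`, as the reported values of 0.41 and 0.39 do.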