Performance of ChatGPT in Board Examinations for Specialists in the Japanese Ophthalmology Society.

Author: Sakai D; Department of Ophthalmology, Kobe City Eye Hospital, Kobe, JPN.; Department of Ophthalmology, Kobe City Medical Center General Hospital, Kobe, JPN.; Department of Surgery, Division of Ophthalmology, Kobe University Graduate School of Medicine, Kobe, JPN., Maeda T; Department of Ophthalmology, Kobe City Eye Hospital, Kobe, JPN., Ozaki A; Department of Ophthalmology, Kobe City Eye Hospital, Kobe, JPN.; Department of Ophthalmology, Mie University Graduate School of Medicine, Tsu, JPN., Kanda GN; Department of Ophthalmology, Kobe City Eye Hospital, Kobe, JPN.; Laboratory for Biologically Inspired Computing, RIKEN Center for Biosystems Dynamics Research, Kobe, JPN., Kurimoto Y; Department of Ophthalmology, Kobe City Eye Hospital, Kobe, JPN.; Department of Ophthalmology, Kobe City Medical Center General Hospital, Kobe, JPN., Takahashi M; Department of Ophthalmology, Kobe City Eye Hospital, Kobe, JPN.
Language: English
Source: Cureus [Cureus] 2023 Dec 04; Vol. 15 (12), pp. e49903. Date of Electronic Publication: 2023 Dec 04 (Print Publication: 2023).
DOI: 10.7759/cureus.49903
Abstract: We investigated the potential of ChatGPT in ophthalmology in the Japanese language using board examinations for specialists in the Japanese Ophthalmology Society. We tested GPT-3.5- and GPT-4-based ChatGPT on five sets of past board examination problems in July 2023. Japanese text was used as the prompt, adopting two strategies: zero- and few-shot prompting. We compared the correct answer rate of ChatGPT with that of actual examinees, and performance characteristics across 10 subspecialties were assessed. ChatGPT-3.5 and ChatGPT-4 correctly answered 112 (22.4%) and 229 (45.8%) of 500 questions with simple zero-shot prompting, respectively, and ChatGPT-4 correctly answered 231 (46.2%) questions with few-shot prompting. The correct answer rates of ChatGPT-3.5 were roughly one-half to one-third of those of the actual examinees for each examination set (p = 0.001), whereas the correct answer rates of ChatGPT-4 reached approximately 70% of those of the examinees. ChatGPT-4 had the highest correct answer rate (71.4% with zero-shot prompting and 61.9% with few-shot prompting) in "blepharoplasty, orbit, and ocular oncology," and the lowest correct answer rate (30.0% with zero-shot prompting and 23.3% with few-shot prompting) in "pediatric ophthalmology." We concluded that ChatGPT could become a practical tool in Japanese ophthalmology.
Competing Interests: The authors have declared that no competing interests exist.
(Copyright © 2023, Sakai et al.)
Database: MEDLINE