Evaluation of ChatGPT-Generated Differential Diagnosis for Common Diseases With Atypical Presentation: Descriptive Research.
Authors: | Shikino K; Department of General Medicine, Chiba University Hospital, Chiba, Japan.; Department of Community-Oriented Medical Education, Chiba University Graduate School of Medicine, Chiba, Japan., Shimizu T; Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Tochigi, Japan., Otsuka Y; Department of General Medicine, Dentistry and Pharmaceutical Sciences, Okayama University Graduate School of Medicine, Okayama, Japan., Tago M; Department of General Medicine, Saga University Hospital, Saga, Japan., Takahashi H; Department of General Medicine, Juntendo University Hospital Faculty of Medicine, Tokyo, Japan., Watari T; Integrated Clinical Education Center Hospital Integrated Clinical Education, Kyoto University Hospital, Kyoto, Japan., Sasaki Y; Department of General Medicine and Emergency Care, Toho University School of Medicine, Tokyo, Japan., Iizuka G; Center for Preventive Medical Sciences, Chiba University, Chiba, Japan.; Tama Family Clinic, Kanagawa, Japan., Tamura H; Department of General Medicine, Chiba University Hospital, Chiba, Japan., Nakashima K; Department of General Medicine, Awa Regional Medical Center, Chiba, Japan., Kunitomo K; Department of General Medicine, National Hospital Organization Kumamoto Medical Center, Kumamoto, Japan., Suzuki M; Department of General Medicine, National Hospital Organization Kumamoto Medical Center, Kumamoto, Japan.; Department of Neurology, University of Utah, Salt Lake City, UT, United States., Aoyama S; Department of Internal Medicine, Mito Kyodo General Hospital, Ibaraki, Japan., Kosaka S; Tokyo Metropolitan Hiroo Hospital, Tokyo, Japan., Kawahigashi T; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, United States., Matsumoto T; Division of General Medicine, Nerima Hikarigaoka Hospital, Tokyo, Japan., Orihara F; Division of General Medicine, Nerima Hikarigaoka Hospital, Tokyo, Japan., Morikawa T; Department of General Medicine, Nara City Hospital, Nara, Japan., Nishizawa T; Department of General Internal Medicine, St. Luke's International Hospital, Tokyo, Japan., Hoshina Y; Department of Neurology, University of Utah, Salt Lake City, UT, United States., Yamamoto Y; Division of General Medicine, Center for Community Medicine, Jichi Medical University, Tochigi, Japan., Matsuo Y; Department of Clinical Epidemiology and Health Economics, The Graduate School of Medicine, The University of Tokyo, Tokyo, Japan., Unoki Y; Department of General Internal Medicine, Iizuka Hospital, Fukuoka, Japan., Kimura H; Department of General Internal Medicine, Iizuka Hospital, Fukuoka, Japan., Tokushima M; Saga Medical Career Support Center, Saga University Hospital, Saga, Japan., Watanuki S; Department of Emergency and General Medicine, Tokyo Metropolitan Tama Medical Center, Tokyo, Japan., Saito T; Department of Emergency and General Medicine, Tokyo Metropolitan Tama Medical Center, Tokyo, Japan., Otsuka F; Department of General Medicine, Dentistry and Pharmaceutical Sciences, Okayama University Graduate School of Medicine, Okayama, Japan., Tokuda Y; Muribushi Okinawa Center for Teaching Hospitals, Okinawa, Japan.; Tokyo Foundation for Policy Research, Tokyo, Japan. |
---|---|
Language: | English |
Source: | JMIR medical education [JMIR Med Educ] 2024 Jun 21; Vol. 10, pp. e58758. Date of Electronic Publication: 2024 Jun 21. |
DOI: | 10.2196/58758 |
Abstract: | Background: The persistence of diagnostic errors, despite advances in medical knowledge and diagnostics, highlights the importance of understanding atypical disease presentations and their contribution to mortality and morbidity. Artificial intelligence (AI), particularly generative pre-trained transformers like GPT-4, holds promise for improving diagnostic accuracy, but requires further exploration in handling atypical presentations. Objective: This study aimed to assess the diagnostic accuracy of ChatGPT in generating differential diagnoses for atypical presentations of common diseases, with a focus on the model's reliance on patient history during the diagnostic process. Methods: We used 25 clinical vignettes from the Journal of Generalist Medicine characterizing atypical manifestations of common diseases. Two general medicine physicians categorized the cases based on atypicality. ChatGPT was then used to generate differential diagnoses based on the clinical information provided. The concordance between AI-generated and final diagnoses was measured, with a focus on the top-ranked disease (top 1) and the top 5 differential diagnoses (top 5). Results: ChatGPT's diagnostic accuracy decreased with an increase in atypical presentation. For category 1 (C1) cases, the concordance rates were 17% (n=1) for the top 1 and 67% (n=4) for the top 5. Categories 3 (C3) and 4 (C4) showed a 0% concordance for top 1 and markedly lower rates for the top 5, indicating difficulties in handling highly atypical cases. The χ² test revealed no significant difference in the top 1 differential diagnosis accuracy between less atypical (C1+C2) and more atypical (C3+C4) groups (χ²₁=2.07; n=25; P=.13). However, a significant difference was found in the top 5 analyses, with less atypical cases showing higher accuracy (χ²₁=4.01; n=25; P=.048). Conclusions: ChatGPT-4 demonstrates potential as an auxiliary tool for diagnosing typical and mildly atypical presentations of common diseases. However, its performance declines with greater atypicality. The study findings underscore the need for AI systems to encompass a broader range of linguistic capabilities, cultural understanding, and diverse clinical scenarios to improve diagnostic utility in real-world settings. (© Kiyoshi Shikino, Taro Shimizu, Yuki Otsuka, Masaki Tago, Hiromizu Takahashi, Takashi Watari, Yosuke Sasaki, Gemmei Iizuka, Hiroki Tamura, Koichi Nakashima, Kotaro Kunitomo, Morika Suzuki, Sayaka Aoyama, Shintaro Kosaka, Teiko Kawahigashi, Tomohiro Matsumoto, Fumina Orihara, Toru Morikawa, Toshinori Nishizawa, Yoji Hoshina, Yu Yamamoto, Yuichiro Matsuo, Yuto Unoki, Hirofumi Kimura, Midori Tokushima, Satoshi Watanuki, Takuma Saito, Fumio Otsuka, Yasuharu Tokuda. Originally published in JMIR Medical Education (https://mededu.jmir.org).) |
Database: | MEDLINE |
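
The top 1 and top 5 concordance measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `top_k_concordance`, the toy cases, and the use of case-insensitive exact string matching are all assumptions made for illustration (the record does not say how concordance was judged). Each case pairs a ranked differential list with a final diagnosis, and a case counts as concordant when the final diagnosis appears among the first k entries.

```python
from typing import Sequence


def top_k_concordance(
    ranked_lists: Sequence[Sequence[str]],
    final_diagnoses: Sequence[str],
    k: int,
) -> float:
    """Return the fraction of cases whose final diagnosis is in the top k.

    Matching is by case-insensitive exact string comparison, which is an
    assumption of this sketch and not a detail taken from the study.
    """
    if len(ranked_lists) != len(final_diagnoses):
        raise ValueError("each case needs exactly one final diagnosis")
    if not final_diagnoses:
        raise ValueError("at least one case is required")

    hits = 0
    for ranked, final in zip(ranked_lists, final_diagnoses):
        top_k = {d.strip().lower() for d in ranked[:k]}
        if final.strip().lower() in top_k:
            hits += 1
    return hits / len(final_diagnoses)


# Hypothetical toy cases, invented for illustration only (not study data).
ranked = [
    ["pneumonia", "pulmonary embolism", "heart failure"],
    ["migraine", "tension headache"],
    ["gastritis", "cholecystitis", "pancreatitis"],
]
finals = ["pulmonary embolism", "migraine", "appendicitis"]

top1 = top_k_concordance(ranked, finals, k=1)
top5 = top_k_concordance(ranked, finals, k=5)
print(f"top 1: {top1:.0%}, top 5: {top5:.0%}")
```

The same arithmetic applies to the reported C1 figures: 17% (n=1) and 67% (n=4) are consistent with 1 of 6 and 4 of 6 cases, respectively.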