Vignette-based comparative analysis of ChatGPT and specialist treatment decisions for rheumatic patients: results of the Rheum2Guide study.
Authors: | Labinsky H; Department of Internal Medicine 2, Rheumatology/Clinical Immunology, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080, Würzburg, Germany., Nagler LK; Department of Internal Medicine 2, Rheumatology/Clinical Immunology, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080, Würzburg, Germany., Krusche M; Division of Rheumatology and Systemic Inflammatory Diseases, III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany., Griewing S; Institute for Digital Medicine, University Hospital Giessen-Marburg, Philipps University, Baldingerstrasse, Marburg, Germany.; Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Palo Alto, CA, USA., Aries P; Department of Rheumatology, Immunologikum, Hamburg, Germany., Kroiß A; Department of Internal Medicine 2, Rheumatology/Clinical Immunology, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080, Würzburg, Germany., Strunz PP; Department of Internal Medicine 2, Rheumatology/Clinical Immunology, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080, Würzburg, Germany., Kuhn S; Institute for Digital Medicine, University Hospital Giessen-Marburg, Philipps University, Baldingerstrasse, Marburg, Germany., Schmalzing M; Department of Internal Medicine 2, Rheumatology/Clinical Immunology, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080, Würzburg, Germany., Gernert M; Department of Internal Medicine 2, Rheumatology/Clinical Immunology, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080, Würzburg, Germany., Knitza J; Institute for Digital Medicine, University Hospital Giessen-Marburg, Philipps University, Baldingerstrasse, Marburg, Germany. knitza@uni-marburg.de.; AGEIS, Université Grenoble Alpes, Grenoble, France. knitza@uni-marburg.de. |
---|---|
Language: | English |
Source: | Rheumatology international [Rheumatol Int] 2024 Oct; Vol. 44 (10), pp. 2043-2053. Date of Electronic Publication: 2024 Aug 10. |
DOI: | 10.1007/s00296-024-05675-5 |
Abstract: | Background: The complex nature of rheumatic diseases poses considerable challenges for clinicians when developing individualized treatment plans. Large language models (LLMs) such as ChatGPT could enable treatment decision support. Objective: To compare treatment plans generated by ChatGPT-3.5 and GPT-4 to those of a clinical rheumatology board (RB). Design/methods: Fictional patient vignettes were created and GPT-3.5, GPT-4, and the RB were queried to provide respective first- and second-line treatment plans with underlying justifications. Four rheumatologists from different centers, blinded to the origin of treatment plans, selected the overall preferred treatment concept and assessed treatment plans' safety, EULAR guideline adherence, medical adequacy, overall quality, justification of the treatment plans and their completeness as well as patient vignette difficulty using a 5-point Likert scale. Results: 20 fictional vignettes covering various rheumatic diseases and varying difficulty levels were assembled and a total of 160 ratings were assessed. In 68.8% (110/160) of cases, raters preferred the RB's treatment plans over those generated by GPT-4 (16.3%; 26/160) and GPT-3.5 (15.0%; 24/160). GPT-4's plans were chosen more frequently for first-line treatments compared to GPT-3.5. No significant safety differences were observed between RB and GPT-4's first-line treatment plans. Rheumatologists' plans received significantly higher ratings in guideline adherence, medical appropriateness, completeness and overall quality. Ratings did not correlate with the vignette difficulty. LLM-generated plans were notably longer and more detailed. Conclusion: GPT-4 and GPT-3.5 generated safe, high-quality treatment plans for rheumatic diseases, demonstrating promise in clinical decision support. Future research should investigate detailed standardized prompts and the impact of LLM usage on clinical decisions. (© 2024. The Author(s).) |
Database: | MEDLINE |