Evaluating ChatGPT-4's performance as a digital health advisor for otosclerosis surgery.
Author: | Sahin S; Private Practitioner, Istanbul, Türkiye., Erkmen B; Private Practitioner, Istanbul, Türkiye., Duymaz YK; Umraniye Research and Training Hospital, University of Health Sciences, Istanbul, Türkiye., Bayram F; Umraniye Research and Training Hospital, University of Health Sciences, Istanbul, Türkiye., Tekin AM; Department of Otolaryngology and Head & Neck Surgery, Vrije Universiteit Brussel, Brussels Health Care Center, Brussels, Belgium., Topsakal V; Department of Otolaryngology and Head & Neck Surgery, Vrije Universiteit Brussel, Brussels Health Care Center, Brussels, Belgium. |
---|---
Language: | English |
Source: | Frontiers in surgery [Front Surg] 2024 Jun 05; Vol. 11, pp. 1373843. Date of Electronic Publication: 2024 Jun 05 (Print Publication: 2024). |
DOI: | 10.3389/fsurg.2024.1373843 |
Abstract: | Purpose: This study aims to evaluate the effectiveness of ChatGPT-4, an artificial intelligence (AI) chatbot, in providing accurate and comprehensible information to patients regarding otosclerosis surgery. Methods: On October 20, 2023, 15 hypothetical questions were posed to ChatGPT-4 to simulate physician-patient interactions about otosclerosis surgery. Responses were evaluated by three independent ENT specialists using the DISCERN scoring system. Readability was assessed with multiple indices: Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (FOG), Simple Measure of Gobbledygook (SMOG), Coleman-Liau Index (CLI), and Automated Readability Index (ARI). Results: The responses from ChatGPT-4 received DISCERN scores ranging from poor to excellent, with an overall score of 50.7 ± 8.2. The readability analysis indicated that the texts were above the 6th-grade level, suggesting they may not be easily comprehensible to the average reader. There was a significant positive correlation between the referees' scores. Although ChatGPT-4 provided correct information in over 90% of cases, the study highlights concerns about the potential for incomplete or misleading answers and the high readability level of the responses. Conclusion: While ChatGPT-4 shows potential for delivering accurate health information, its utility is limited by the readability level of its responses. The study underscores the need for continuous improvement in AI systems to ensure the delivery of information that is both accurate and accessible to patients with varying levels of health literacy. Healthcare professionals should supervise the use of such technologies to enhance patient education and care. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
(© 2024 Sahin, Erkmen, Duymaz, Bayram, Tekin and Topsakal.) |
Database: | MEDLINE |
External link: |
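The abstract's readability analysis rests on standard sentence-, word-, and syllable-count formulas. As a minimal sketch of how two of the cited indices (FRE and FKGL) are computed, the following uses the published formulas with a naive vowel-group syllable heuristic; real tools such as the ones likely used in the study rely on pronunciation dictionaries, so exact scores will differ.

```python
import re

def count_syllables(word):
    """Naive syllable estimate: count vowel groups, drop a trailing silent 'e'.
    This heuristic is an assumption; dictionary-based counters are more accurate."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text,
    using the standard published formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # FRE: higher = easier; 60-70 is roughly plain English.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # FKGL: approximate US school grade level; the study's 6th-grade
    # threshold corresponds to FKGL <= 6.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl
```

For example, a short all-monosyllable sentence such as "The cat sat on the mat." scores very easy (FRE above 100, FKGL below grade 1), while long clinical sentences with polysyllabic terms push FKGL well above the 6th-grade target mentioned in the abstract.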