Understanding How ChatGPT May Become a Clinical Administrative Tool Through an Investigation on the Ability to Answer Common Patient Questions Concerning Ulnar Collateral Ligament Injuries.

Author: Varady NH; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA., Lu AZ; Weill Cornell Medical College, New York, New York, USA., Mazzucco M; Weill Cornell Medical College, New York, New York, USA., Dines JS; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA., Altchek DW; Weill Cornell Medical College, New York, New York, USA., Williams RJ 3rd; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA., Kunze KN; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA.
Language: English
Source: Orthopaedic journal of sports medicine [Orthop J Sports Med] 2024 Jul 31; Vol. 12 (7), pp. 23259671241257516. Date of Electronic Publication: 2024 Jul 31 (Print Publication: 2024).
DOI: 10.1177/23259671241257516
Abstract: Background: The consumer availability and automated response capabilities of Chat Generative Pretrained Transformer (ChatGPT-4), a large language model, position this application to be used for patient health queries, and it may have a role as an adjunct that minimizes administrative and clinical burden.
Purpose: To evaluate the ability of ChatGPT-4 to respond to patient inquiries concerning ulnar collateral ligament (UCL) injuries and compare these results with the performance of Google.
Study Design: Cross-sectional study.
Methods: Google Web Search was used as a benchmark, as it is the most widely used search engine worldwide and the only search engine that generates frequently asked questions (FAQs) when prompted with a query, allowing comparisons through a systematic approach. The query "ulnar collateral ligament reconstruction" was entered into Google, and the top 10 FAQs, answers, and their sources were recorded. ChatGPT-4 was prompted to perform a Google search of FAQs with the same query and to record the sources of its answers for comparison. This process was then repeated to obtain 10 new questions requiring numeric rather than open-ended responses. Finally, responses were graded independently for clinical accuracy (grade 0 = inaccurate, grade 1 = somewhat accurate, grade 2 = accurate) by 2 fellowship-trained sports medicine surgeons (D.W.A., J.S.D.) blinded to the search engine and answer source.
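The study itself used the consumer ChatGPT-4 interface rather than an API. Purely as an illustration, the sketch below shows how a comparable FAQ request built around the same query string could be issued programmatically through the OpenAI Python SDK; the model identifier and prompt wording are assumptions, not details taken from the study.

```python
# Hypothetical replication sketch only; the study queried the consumer
# ChatGPT-4 interface, not the API. Model name and prompt wording are assumed.
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

query = "ulnar collateral ligament reconstruction"
prompt = (
    f"Perform a Google search for '{query}' and list the top 10 "
    "frequently asked questions, each with an answer and the source "
    "website used for that answer."
)

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```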
Results: ChatGPT-4 used a greater proportion of academic sources than Google to answer the top 10 FAQs, although the difference was not statistically significant (90% vs 50%; P = .14). In terms of question overlap, 40% of the most common questions on Google and ChatGPT-4 were the same. When comparing FAQs with numeric responses, 20% of answers overlapped completely, 30% overlapped partially, and the remaining 50% did not overlap at all. All sources used by ChatGPT-4 to answer these FAQs were academic, whereas only 20% of the sources used by Google were academic (P = .0007). The remaining Google sources included social media (40%), medical practices (20%), single-surgeon websites (10%), and commercial websites (10%). The mean (± standard deviation) accuracy of answers given by ChatGPT-4 was significantly greater than that of Google for both the top 10 FAQs (1.9 ± 0.2 vs 1.2 ± 0.6; P = .001) and the top 10 questions with numeric answers (1.8 ± 0.4 vs 1.0 ± 0.8; P = .013).
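The abstract does not name the statistical test used for the source-type comparisons. Assuming a Fisher exact test on 2 × 2 counts derived from the reported percentages (9/10 vs 5/10 academic sources for the top 10 FAQs; 10/10 vs 2/10 for the numeric-answer questions), the reported P values of .14 and .0007 can be reproduced as in the sketch below.

```python
# Illustrative sketch only: Fisher exact tests on counts inferred from the
# percentages reported in the abstract (10 questions per search engine).
from scipy.stats import fisher_exact

# Top 10 FAQs: academic vs non-academic sources (ChatGPT-4 vs Google)
_, p_faq = fisher_exact([[9, 1], [5, 5]])
print(f"FAQ source comparison: P = {p_faq:.2f}")             # ~0.14

# Numeric-answer FAQs: academic vs non-academic sources
_, p_numeric = fisher_exact([[10, 0], [2, 8]])
print(f"Numeric-answer source comparison: P = {p_numeric:.4f}")  # ~0.0007
```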
Conclusion: ChatGPT-4 is capable of providing clinically relevant responses concerning UCL injuries and reconstruction. Compared with Google Web Search, ChatGPT-4 drew on a greater proportion of academic websites when responding to FAQs representative of patient inquiries and provided significantly more accurate answers. Moving forward, ChatGPT has the potential to be used as a clinical adjunct when answering queries about UCL injuries and reconstruction, but further validation is warranted before it is integrated into, or used autonomously in, clinical settings.
Competing Interests: One or more of the authors has declared the following potential conflict of interest or source of funding: J.S.D. has received consulting fees from Arthrex and Trice Medical; royalties or license payments from Linvatec; nonconsulting fees from Arthrex; education payments from Gotham Surgical Solutions & Devices; and hospitality payments from Horizon Pharma. D.W.A. has received royalties or license payments from Stryker and hospitality payments from Arthrex. R.J.W. has received royalties or license payments from Arthrex; acquisitions from Smith+Nephew; nonconsulting fees from Arthrex; consulting fees from Arthrex; and hospitality payments from Joint Restoration Foundation. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.
(© The Author(s) 2024.)
Database: MEDLINE