Evaluating the clinical utility of artificial intelligence assistance and its explanation on the glioma grading task.

Authors: Jin W; School of Computing Science, Simon Fraser University, Burnaby, Canada. Electronic address: weinaj@sfu.ca., Fatehi M; Division of Neurosurgery, The University of British Columbia, Vancouver, Canada. Electronic address: fatehi@alumni.ubc.ca., Guo R; Division of Neurosurgery, The University of British Columbia, Vancouver, Canada. Electronic address: rucheng@student.ubc.ca., Hamarneh G; School of Computing Science, Simon Fraser University, Burnaby, Canada. Electronic address: hamarneh@sfu.ca.
Language: English
Source: Artificial intelligence in medicine [Artif Intell Med] 2024 Feb; Vol. 148, pp. 102751. Date of Electronic Publication: 2024 Jan 02.
DOI: 10.1016/j.artmed.2023.102751
Abstract: Clinical evaluation evidence and model explainability are key gatekeepers to ensure the safe, accountable, and effective use of artificial intelligence (AI) in clinical settings. We conducted a clinical user-centered evaluation with 35 neurosurgeons to assess the utility of AI assistance and its explanation on the glioma grading task. Each participant read 25 brain MRI scans of patients with gliomas, and gave their judgment on the glioma grading without and with the assistance of AI prediction and explanation. The AI model was trained on the BraTS dataset with 88.0% accuracy. The AI explanation was generated using the explainable AI algorithm of SmoothGrad, which was selected from 16 algorithms based on the criterion of being truthful to the AI decision process. Results showed that compared to the average accuracy of 82.5±8.7% when physicians performed the task alone, physicians' task performance increased to 87.7±7.3% with statistical significance (p-value = 0.002) when assisted by AI prediction, and remained at almost the same level of 88.5±7.0% (p-value = 0.35) with the additional assistance of AI explanation. Based on quantitative and qualitative results, the observed improvement in physicians' task performance assisted by AI prediction was mainly because physicians' decision patterns converged to be similar to AI, as physicians only switched their decisions when disagreeing with AI. The insignificant change in physicians' performance with the additional assistance of AI explanation was because the AI explanations did not provide explicit reasons, contexts, or descriptions of clinical features to help doctors discern potentially incorrect AI predictions. The evaluation showed the clinical utility of AI to assist physicians on the glioma grading task, and identified the limitations and clinical usage gaps of existing explainable AI techniques for future improvement.
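Illustrative note: the abstract names SmoothGrad as the explanation algorithm but does not describe it. As a rough sketch only, SmoothGrad averages the model's input gradients over several noise-perturbed copies of the input. The toy sigmoid model, the weights, and all function names and parameter values below (`model_score`, `model_gradient`, `smoothgrad`, `n_samples`, `noise_std`) are assumptions made for illustration; they are not the authors' glioma grading model or their code.

```python
import math
import random


def model_score(x, w):
    """Toy stand-in for a classifier (hypothetical): sigmoid of a weighted sum."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def model_gradient(x, w):
    """Analytic gradient of model_score with respect to the input x."""
    s = model_score(x, w)
    return [s * (1.0 - s) * wi for wi in w]


def smoothgrad(x, w, n_samples=50, noise_std=0.1, seed=0):
    """Average the input gradient over Gaussian-perturbed copies of x."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = [0.0] * len(x)
    for _ in range(n_samples):
        # Perturb every input feature with zero-mean Gaussian noise.
        noisy = [xi + rng.gauss(0.0, noise_std) for xi in x]
        grad = model_gradient(noisy, w)
        total = [t + g for t, g in zip(total, grad)]
    # The mean gradient is the per-feature attribution.
    return [t / n_samples for t in total]


# Hypothetical three-feature input; the second feature has zero weight.
weights = [2.0, 0.0, -1.0]
inputs = [0.5, 0.5, 0.5]
attribution = smoothgrad(inputs, weights)
```

In an imaging setting the same averaging would be applied per voxel with gradients obtained from an automatic differentiation framework; the analytic gradient here only keeps the sketch free of dependencies.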
Competing Interests: All authors declare no financial or non-financial competing interests.
(Copyright © 2023. Published by Elsevier B.V.)
Database: MEDLINE