Author:
Tanno, Ryutaro; Barrett, David G. T.; Sellergren, Andrew; Ghaisas, Sumedh; Dathathri, Sumanth; See, Abigail; Welbl, Johannes; Lau, Charles; Tu, Tao; Azizi, Shekoofeh; Singhal, Karan; Schaekermann, Mike; May, Rhys; Lee, Roy; Man, SiWai; Mahdavi, Sara; Ahmed, Zahra; Matias, Yossi; Barral, Joelle; Eslami, S. M. Ali; Belgrave, Danielle; Liu, Yun; Kalidindi, Sreenivasa Raju; Shetty, Shravya; Natarajan, Vivek; Kohli, Pushmeet; Huang, Po-Sen; Karthikesalingam, Alan; Ktena, Ira
Source:
Nature Medicine; 2024-01-01, Issue: Preprints, pp. 1-10, 10 pp.
Abstract:
Automated radiology report generation has the potential to improve patient care and reduce the workload of radiologists. However, the path toward real-world adoption has been stymied by the challenge of evaluating the clinical quality of artificial intelligence (AI)-generated reports. We build a state-of-the-art report generation system for chest radiographs, called Flamingo-CXR, and perform an expert evaluation of AI-generated reports by engaging a panel of board-certified radiologists. We observe a wide distribution of preferences across the panel and across clinical settings, with 56.1% of Flamingo-CXR intensive care reports evaluated as preferable or equivalent to clinician reports by half or more of the panel, rising to 77.7% for in/outpatient X-rays overall and to 94% for the subset of cases with no pertinent abnormal findings. Errors were observed in both human-written and Flamingo-CXR reports, with 24.8% of in/outpatient cases containing clinically significant errors in both report types, 22.8% in Flamingo-CXR reports only, and 14.0% in human reports only. For reports that contain errors, we develop an assistive setting, a demonstration of clinician–AI collaboration for radiology report composition, indicating new possibilities for potential clinical utility.
Database:
Supplemental Index
External link: