Authors: |
Batra K; Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390., Xi Y; Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390., Al-Hreish KM; Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390., Kay FU; Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390., Browning T; Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390., Baker C; Health Systems Information Resources, University of Texas Southwestern Health Systems, Dallas, TX., Peshock RM; Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390.; Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX. |
Abstract: |
BACKGROUND. Artificial intelligence (AI) algorithms have shown strong performance for detection of pulmonary embolism (PE) on CT examinations performed using a dedicated protocol for PE detection. AI performance is less well studied for detecting PE on examinations ordered for reasons other than suspected PE (i.e., incidental PE [iPE]). OBJECTIVE. The purpose of this study was to assess the diagnostic performance of an AI algorithm for detection of iPE on conventional contrast-enhanced chest CT examinations. METHODS. This retrospective study included 2555 patients (mean age, 53.2 ± 14.5 [SD] years; 1340 women, 1215 men) who underwent 3003 conventional contrast-enhanced chest CT examinations (i.e., not using pulmonary CTA protocols) between September 2019 and February 2020. A commercial AI algorithm was applied to the images to detect acute iPE. A vendor-supplied natural language processing (NLP) algorithm was applied to the clinical reports to identify examinations interpreted as positive for iPE. For all examinations that were positive by the AI-based image review or by NLP-based report review, a multireader adjudication process was implemented to establish a reference standard for iPE. Images were also reviewed to identify explanations of AI misclassifications. RESULTS. On the basis of the adjudication process, the frequency of iPE was 1.3% (40/3003). AI detected four iPEs missed by clinical reports, and clinical reports detected seven iPEs missed by AI. AI, compared with clinical reports, exhibited significantly lower PPV (86.8% vs 97.3%, p = .03) and specificity (99.8% vs 100.0%, p = .045). Differences in sensitivity (82.5% vs 90.0%, p = .37) and NPV (99.8% vs 99.9%, p = .36) were not significant. For AI, neither sensitivity nor specificity varied significantly in association with age, sex, patient status, or cancer-related clinical scenario (all p > .05). 
Explanations of false-positives by AI included metastatic lymph nodes and pulmonary venous filling defect, and explanations of false-negatives by AI included surgically altered anatomy and small-caliber subsegmental vessels. CONCLUSION. AI had high NPV and moderate PPV for iPE detection, detecting some iPEs missed by radiologists. CLINICAL IMPACT. Potential applications of the AI tool include serving as a second reader to help detect additional iPEs or as a worklist triage tool to allow earlier iPE detection and intervention. Various explanations of AI misclassifications may provide targets for model improvement. |
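The reported percentages can be cross-checked with a small calculation. The sketch below is illustrative only: the abstract does not state the confusion-matrix counts, so the true-positive, false-positive, and false-negative counts used here are back-calculated assumptions, inferred from the 40 adjudicated iPEs among 3003 examinations, the seven iPEs missed by AI, the four missed by clinical reports, and the reported PPVs. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Stated in the abstract: 3003 examinations, 40 with adjudicated iPE.
TOTAL_EXAMS = 3003
IPE_POSITIVE = 40
IPE_NEGATIVE = TOTAL_EXAMS - IPE_POSITIVE

# ASSUMPTION: counts below are inferred, not reported in the abstract.
# AI missed 7 of 40 iPEs (33 true positives); a PPV of 86.8% then
# implies 5 false positives (33 / 38).
ai = diagnostic_metrics(tp=33, fp=5, fn=7, tn=IPE_NEGATIVE - 5)

# Clinical reports missed 4 of 40 iPEs (36 true positives); a PPV of
# 97.3% then implies 1 false positive (36 / 37).
reports = diagnostic_metrics(tp=36, fp=1, fn=4, tn=IPE_NEGATIVE - 1)

for name, metrics in (("AI", ai), ("Clinical reports", reports)):
    summary = ", ".join(f"{key} {value:.1f}%" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

Under these assumed counts, the rounded values agree with the sensitivity, specificity, PPV, and NPV given in the RESULTS, which suggests the inferred counts are at least consistent with the abstract, though other details of the study's analysis are not recoverable from this record.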