Author: |
Grøvik E; Department of Diagnostic Physics, Oslo University Hospital, Oslo, Norway.; Department of Radiology, Stanford University, Stanford, USA.; Faculty of Health and Social Sciences, University of South-Eastern Norway, Drammen, Norway., Yi D; Department of Biomedical Data Science, Stanford University, Stanford, USA., Iv M; Department of Radiology, Stanford University, Stanford, USA., Tong E; Department of Radiology, Stanford University, Stanford, USA., Nilsen LB; Department of Diagnostic Physics, Oslo University Hospital, Oslo, Norway., Latysheva A; Department of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway., Saxhaug C; Department of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway., Jacobsen KD; Department of Oncology, Oslo University Hospital, Oslo, Norway., Helland Å; Department of Oncology, Oslo University Hospital, Oslo, Norway., Emblem KE; Department of Diagnostic Physics, Oslo University Hospital, Oslo, Norway., Rubin DL; Department of Biomedical Data Science, Stanford University, Stanford, USA., Zaharchuk G; Department of Radiology, Stanford University, Stanford, USA. gregz@stanford.edu. |
Abstract: |
The purpose of this study was to assess the clinical value of a deep learning (DL) model for automatic detection and segmentation of brain metastases, in which a neural network is trained on four distinct MRI sequences using an input-level dropout layer, thus simulating the scenario of missing MRI sequences by training on the full set and all possible subsets of the input data. This retrospective, multicenter study evaluated 165 patients with brain metastases. The proposed input-level dropout (ILD) model was trained on multisequence MRI from 100 patients and validated and tested on 10 and 55 patients, respectively; the test set was missing one of the four MRI sequences used for training. The segmentation results were compared with the performance of a state-of-the-art DeepLab V3 model. The MR sequences in the training set included pre-gadolinium and post-gadolinium (Gd) T1-weighted 3D fast spin echo, post-Gd T1-weighted inversion recovery (IR) prepped fast spoiled gradient echo, and 3D fluid attenuated inversion recovery (FLAIR), whereas the test set did not include the IR prepped image series. The ground truth segmentations were established by experienced neuroradiologists. The results were evaluated using precision, recall, intersection over union (IoU) score, Dice score, and receiver operating characteristic (ROC) curve statistics, while the Wilcoxon rank sum test was used to compare the performance of the two neural networks. The area under the ROC curve (AUC), averaged across all test cases, was 0.989 ± 0.029 for the ILD-model and 0.989 ± 0.023 for the DeepLab V3 model (p = 0.62). The ILD-model showed a significantly higher Dice score (0.795 ± 0.104 vs. 0.774 ± 0.104, p = 0.017) and IoU score (0.561 ± 0.225 vs. 0.492 ± 0.186, p < 0.001) compared to the DeepLab V3 model, and a significantly lower average false positive rate of 3.6/patient vs. 7.0/patient (p < 0.001) using a 10 mm³ lesion-size limit. 
The ILD-model, trained on all possible combinations of four MRI sequences, may facilitate accurate detection and segmentation of brain metastases on a multicenter basis, even when the test cohort is missing input MRI sequences. |
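The idea of input-level dropout described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the sequence labels, the function names `keep_masks`, `sample_mask`, and `input_level_dropout`, the exclusion of the empty subset, and the choice to zero-fill a dropped sequence are all assumptions made for illustration. Plain nested lists stand in for image tensors so the example runs without extra libraries.

```python
import itertools
import random

# Hypothetical labels for the four input sequences named in the abstract.
SEQUENCES = ("T1_pre", "T1_post_FSE", "T1_post_IR_FSPGR", "FLAIR")


def keep_masks(n_sequences):
    """Return every 0/1 keep-mask over the inputs except the all-zero one.

    Assumption: the empty subset is excluded, since a sample with no
    input sequence carries no information. That leaves 2**n - 1 masks,
    including the full set.
    """
    return [m for m in itertools.product((0, 1), repeat=n_sequences) if any(m)]


def sample_mask(rng, masks):
    """Pick one keep-mask uniformly at random for a training sample."""
    return rng.choice(masks)


def input_level_dropout(channels, mask):
    """Zero-fill every channel whose mask entry is 0.

    `channels` is a list of 2D images (lists of rows), one per sequence.
    Zero-filling is an assumed way of marking a sequence as missing.
    """
    return [
        ch if keep else [[0.0] * len(row) for row in ch]
        for ch, keep in zip(channels, mask)
    ]


if __name__ == "__main__":
    masks = keep_masks(len(SEQUENCES))
    rng = random.Random(0)
    sample = [[[1.0, 2.0], [3.0, 4.0]] for _ in SEQUENCES]
    mask = sample_mask(rng, masks)
    dropped = input_level_dropout(sample, mask)
    for name, keep in zip(SEQUENCES, mask):
        print(name, "kept" if keep else "dropped")
```

At test time, the scenario in the abstract (no IR-prepped series) would correspond to the fixed mask `(1, 1, 0, 1)` under the channel order assumed here.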