Accurate segmentation of head and neck radiotherapy CT scans with 3D CNNs: consistency is key
Authors: | Edward G A Henderson, Eliana M Vasquez Osorio, Marcel van Herk, Charlotte L Brouwer, Roel J H M Steenbakkers, Andrew F Green |
---|---|
Contributors: | Guided Treatment in Optimal Selected Cancer Patients (GUTS) |
Language: | English |
Publication year: | 2023 |
Subject: | effective supervised learning; Radiological and Ultrasound Technology; 3D auto-segmentation; Manchester Cancer Research Centre; ResearchInstitutes_Networks_Beacons/mcrc; convolutional neural network; Radiology, Nuclear Medicine and imaging; training annotation consistency; medical image analysis; small dataset |
Source: | Henderson, E, Vasquez Osorio, E, van Herk, M, Brouwer, C L, Steenbakkers, R J H M & Green, A F 2023, 'Accurate segmentation of head and neck radiotherapy CT scans with 3D CNNs: consistency is key', Physics in Medicine & Biology, 68(8):085003. https://doi.org/10.1088/1361-6560/acc309. IOP Publishing Ltd |
ISSN: | 0031-9155 |
DOI: | 10.1088/1361-6560/acc309 |
Description: | Objective. Automatic segmentation of organs-at-risk in radiotherapy planning computed tomography (CT) scans using convolutional neural networks (CNNs) is an active research area. Very large datasets are usually required to train such CNN models. In radiotherapy, large, high-quality datasets are scarce, and combining data from several sources can reduce the consistency of training segmentations. It is therefore important to understand the impact of training data quality on the performance of auto-segmentation models for radiotherapy. Approach. In this study, we took an existing 3D CNN architecture for head and neck CT auto-segmentation and compared the performance of models trained with a small, well-curated dataset (n = 34) against models trained with a far larger dataset (n = 185) containing less consistent training segmentations. We performed 5-fold cross-validation on each dataset and tested segmentation performance using the 95th percentile Hausdorff distance and mean distance-to-agreement metrics. Finally, we validated the generalisability of our models with an external cohort of patient data (n = 12) with five expert annotators. Main results. The models trained with the large dataset were greatly outperformed by models (of identical architecture) trained with a smaller but more consistent set of training samples. Our models trained with the small dataset produced segmentations of similar accuracy to expert human observers and generalised well to new data, performing within inter-observer variation. Significance. We empirically demonstrate the importance of highly consistent training samples when training a 3D auto-segmentation model for use in radiotherapy. Crucially, the consistency of the training segmentations had a greater impact on model performance than the size of the dataset used. |
Database: | OpenAIRE |
External link: |
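
The description names two surface-distance metrics, the 95th percentile Hausdorff distance and the mean distance-to-agreement. The following is a minimal sketch of how such metrics could be computed, not the authors' implementation. It assumes each segmentation surface is already available as a list of 3D points in millimetres, and it picks one common convention for each metric (definitions vary between papers). The function names `directed_distances`, `percentile`, `hd95`, and `mean_dta` are hypothetical.

```python
import math


def directed_distances(a, b):
    """For every point in surface a, the distance to its nearest point in surface b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def percentile(values, pct):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def hd95(a, b):
    """95th percentile Hausdorff distance.

    Assumed convention: the larger of the two directed 95th percentiles.
    """
    return max(
        percentile(directed_distances(a, b), 95),
        percentile(directed_distances(b, a), 95),
    )


def mean_dta(a, b):
    """Mean distance-to-agreement.

    Assumed convention: mean of the surface distances pooled from both directions.
    """
    pooled = directed_distances(a, b) + directed_distances(b, a)
    return sum(pooled) / len(pooled)


# Toy example: two parallel two-point "surfaces" exactly 1 mm apart.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(hd95(reference, predicted), mean_dta(reference, predicted))  # 1.0 1.0
```

The brute-force nearest-point search is quadratic in the number of surface points, so it only suits small illustrative inputs. A practical pipeline would more likely use a spatial index or a distance transform on the voxel grid.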