Can uncertainty estimation predict segmentation performance in ultrasound bone imaging?

Authors: Pandey PU (School of Biomedical Engineering, University of British Columbia, Vancouver, Canada; prashant@ece.ubc.ca); Guy P (Department of Orthopaedics, Faculty of Medicine, University of British Columbia, Vancouver, Canada); Hodgson AJ (Department of Mechanical Engineering, University of British Columbia, Vancouver, Canada).
Language: English
Source: International journal of computer assisted radiology and surgery [Int J Comput Assist Radiol Surg] 2022 May; Vol. 17 (5), pp. 825-832. Date of Electronic Publication: 2022 Apr 04.
DOI: 10.1007/s11548-022-02597-0
Abstract: Purpose: Segmenting bone surfaces in ultrasound (US) is a fundamental step in US-based computer-assisted orthopaedic surgeries. Neural network-based segmentation techniques are a natural choice for this, given promising results in related tasks. However, for these networks to gain widespread use, we must know how much to trust them during clinical deployment, when ground-truth data is unavailable.
Methods: We investigated alternative ways to measure the uncertainty of trained networks by implementing a baseline U-Net trained on a large dataset, together with three uncertainty estimation modifications: Monte Carlo dropout, test-time augmentation, and ensemble learning. We measured segmentation performance, calibration quality, and how well the uncertainty estimates predict segmentation performance on test data. We further investigated the effect of data quality on these measures.
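Monte Carlo dropout, test-time augmentation, and ensembling all yield several probability maps for the same image, which can then be reduced to a per-pixel uncertainty map. The following is a minimal sketch of one common reduction (mean prediction, binary predictive entropy, and sample variance). It is not taken from the paper: the function name `predictive_uncertainty`, the `(T, H, W)` array layout, and the choice of entropy and variance as the uncertainty measures are assumptions made for illustration.

```python
import numpy as np

def predictive_uncertainty(prob_samples, eps=1e-12):
    """Summarise T stochastic predictions of a binary segmentation.

    prob_samples: array of shape (T, H, W) holding foreground
    probabilities from T forward passes (hypothetical layout; the
    passes could come from dropout, augmentation, or ensemble members).
    Returns (mean, entropy, variance), each of shape (H, W).
    """
    p = np.asarray(prob_samples, dtype=float)
    mean = p.mean(axis=0)  # averaged foreground probability
    # Binary entropy of the averaged prediction (in nats).
    entropy = -(mean * np.log(mean + eps)
                + (1.0 - mean) * np.log(1.0 - mean + eps))
    variance = p.var(axis=0)  # disagreement between the T passes
    return mean, entropy, variance
```

Under these assumptions, pixels where the passes disagree or where the averaged probability sits near 0.5 receive high uncertainty, and an image-level score could be formed by averaging the map.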
Results: Overall, we found that ensemble learning with binary cross-entropy (BCE) loss achieved the best segmentation performance (mean Dice: 0.75-0.78 and RMS distance: 0.62-0.86 mm) and the lowest calibration errors (mean: 0.22-0.28%). In contrast to previous studies of area or volumetric segmentation, we found that the resulting uncertainty measures are not reliable proxies for surface segmentation performance.
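For readers unfamiliar with the two kinds of metric reported here, the sketch below shows how a Dice coefficient and a binned expected calibration error are commonly computed for binary masks. It is an illustration only: the abstract does not state which calibration error definition or binning the authors used, and the function names, the 0.5 decision threshold, and the ten equal-width confidence bins are all assumptions.

```python
import numpy as np

def dice(pred, target):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned gap between confidence and accuracy (assumed definition).

    probs: foreground probabilities; labels: 0/1 ground truth.
    Confidence is the probability of the predicted class, so it lies
    in [0.5, 1] for a binary problem.
    """
    probs = np.asarray(probs, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(int)
    pred = (probs >= 0.5).astype(int)
    conf = np.where(pred == 1, probs, 1.0 - probs)
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.5, 1.0, n_bins + 1)
    bin_idx = np.clip(np.digitize(conf, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            # Weight each bin's |accuracy - confidence| by its share of pixels.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

A well-calibrated network has a calibration error near zero: among pixels predicted with 90% confidence, about 90% are labelled correctly.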
Conclusion: Our experiments indicate that a significant performance and confidence calibration boost can be achieved with ensemble learning and BCE loss, as tested on 13,687 US images containing various anatomies and imaging parameters. However, these techniques do not allow us to reliably predict future segmentation performance. The results of this study can be used to improve the calibration and performance of US segmentation networks.
(© 2022. CARS.)
Database: MEDLINE