Deep learning in computed tomography super resolution using multi‐modality data training.

Author: Fok, Wai Yan Ryan; Fieselmann, Andreas; Herbst, Magdalena; Ritschl, Ludwig; Kappler, Steffen; Saalfeld, Sylvia
Source: Medical Physics; Apr 2024, Vol. 51, Issue 4, p2846-2860, 15p
Abstract: Background: One of the limitations in leveraging the potential of artificial intelligence in X-ray imaging is the limited availability of annotated training data. As X-ray and CT share similar imaging physics, cross-domain data sharing could be achieved by generating labeled synthetic X-ray images from annotated CT volumes as digitally reconstructed radiographs (DRRs). To account for the lower resolution of CT and of CT-generated DRRs compared to real X-ray images, we propose the use of super-resolution (SR) techniques to enhance the CT resolution before DRR generation. Purpose: As spatial resolution can be defined by the modulation transfer function kernel in CT physics, we propose to train an SR network using paired low-resolution (LR) and high-resolution (HR) images by varying the kernel's shape and cutoff frequency. This differs from previous deep learning-based SR techniques for RGB and medical images, which focused on refining the sampling grid. Instead of generating LR images by bicubic interpolation, we aim to create realistic multi-detector CT (MDCT)-like LR images from HR cone-beam CT (CBCT) scans. Methods: We propose and evaluate the use of an SR U-Net for the mapping between LR and HR CBCT image slices. We reconstructed paired LR and HR training volumes from the same CT scans with a small in-plane sampling grid size of $0.20 \times 0.20\,\mathrm{mm}^2$. We used the residual U-Net architecture to train two models: $\mathrm{SRUN}^{K}_{\mathrm{Res}}$, trained with kernel-based LR images, and $\mathrm{SRUN}^{I}_{\mathrm{Res}}$, trained with bicubic downsampled data as baseline. Both models are trained on one CBCT dataset (n = 13 391). The performance of both models was then evaluated on unseen kernel-based and interpolation-based LR CBCT images (n = 10 950), and also on MDCT images (n = 1392). Results: Five-fold cross validation and an ablation study were performed to find the optimal hyperparameters. Both the $\mathrm{SRUN}^{K}_{\mathrm{Res}}$ and $\mathrm{SRUN}^{I}_{\mathrm{Res}}$ models show significant improvements (p-value $<$ 0.05) in mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) on unseen CBCT images. Moreover, the improvement percentages in MAE, PSNR, and SSIM from $\mathrm{SRUN}^{K}_{\mathrm{Res}}$ are larger than those from $\mathrm{SRUN}^{I}_{\mathrm{Res}}$. For $\mathrm{SRUN}^{K}_{\mathrm{Res}}$, MAE is reduced by 14%, and PSNR and SSIM increased by 6% and 8%, respectively. To conclude, $\mathrm{SRUN}^{K}_{\mathrm{Res}}$ outperforms $\mathrm{SRUN}^{I}_{\mathrm{Res}}$: the former generates sharper images when tested with kernel-based LR CBCT images as well as with cross-modality LR MDCT data. Conclusions: Our proposed method showed better performance than the baseline interpolation approach on unseen LR CBCT. We showed that the frequency behavior of the data used is important for learning the SR features. Additionally, we showed cross-modality resolution improvements on LR MDCT images. Our approach is, therefore, a first and essential step toward enabling realistic high-spatial-resolution CT-generated DRRs for deep learning training. [ABSTRACT FROM AUTHOR]
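The abstract contrasts two ways of degrading HR CBCT slices into LR training inputs: a kernel-based degradation that attenuates high spatial frequencies (mimicking an MDCT reconstruction kernel's MTF) versus the common bicubic-interpolation baseline. The sketch below illustrates that contrast; the Gaussian-shaped MTF, its cutoff, and the downsampling factor are assumptions for illustration, not the authors' exact kernels or parameters.

```python
import numpy as np
from scipy.ndimage import zoom

def kernel_based_lowres(hr_slice, cutoff_frac=0.3):
    """Simulate an MDCT-like LR slice from an HR CBCT slice by damping high
    spatial frequencies with a Gaussian-shaped MTF on the same sampling grid.
    The kernel shape and cutoff are illustrative assumptions."""
    hr = hr_slice.astype(np.float32)
    fy = np.fft.fftfreq(hr.shape[0])[:, None]    # cycles/pixel along rows
    fx = np.fft.fftfreq(hr.shape[1])[None, :]    # cycles/pixel along columns
    f = np.hypot(fx, fy)                         # radial spatial frequency
    mtf = np.exp(-0.5 * (f / cutoff_frac) ** 2)  # unity at DC, falls off toward the cutoff
    return np.fft.ifft2(np.fft.fft2(hr) * mtf).real

def interpolation_based_lowres(hr_slice, factor=0.5):
    """Baseline degradation: cubic-spline down- then up-sampling, analogous to
    the bicubic-interpolation LR data used for the comparison model."""
    hr = hr_slice.astype(np.float32)
    down = zoom(hr, factor, order=3)
    return zoom(down, (hr.shape[0] / down.shape[0], hr.shape[1] / down.shape[1]), order=3)

# Example: degrade a synthetic 512x512 HR slice with both strategies.
hr = np.random.rand(512, 512).astype(np.float32)
lr_kernel = kernel_based_lowres(hr)
lr_interp = interpolation_based_lowres(hr)
```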
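The residual U-Net mentioned in the Methods can be pictured as an encoder-decoder that predicts only the HR-minus-LR residual, added back to the input through a global skip connection, and trained on same-grid LR/HR slice pairs with an L1 (MAE) objective. The PyTorch sketch below is a minimal illustration under those assumptions; the depth, channel widths, loss, and optimizer settings are not taken from the paper.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # two 3x3 convolutions with ReLU, the basic encoder/decoder unit
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class ResidualSRUNet(nn.Module):
    """Small U-Net that predicts the high-frequency residual; the output is
    input + residual, so only the missing detail has to be learned."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = conv_block(1, ch)
        self.enc2 = conv_block(ch, 2 * ch)
        self.bott = conv_block(2 * ch, 4 * ch)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(4 * ch, 2 * ch, 2, stride=2)
        self.dec2 = conv_block(4 * ch, 2 * ch)
        self.up1 = nn.ConvTranspose2d(2 * ch, ch, 2, stride=2)
        self.dec1 = conv_block(2 * ch, ch)
        self.out = nn.Conv2d(ch, 1, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bott(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return x + self.out(d1)  # global residual connection

# One training step on stand-in data: LR and HR slices share the same grid.
model = ResidualSRUNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
lr_batch = torch.randn(4, 1, 128, 128)  # placeholder for kernel-based LR slices
hr_batch = torch.randn(4, 1, 128, 128)  # placeholder for matching HR slices
loss = nn.functional.l1_loss(model(lr_batch), hr_batch)  # MAE objective
opt.zero_grad(); loss.backward(); opt.step()
```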
Database: Complementary Index