Description: |
Purpose/Background: We externally validated the normal tissue complication probability (NTCP) model for grade II-IV dysphagia at 6 months in head and neck cancer patients, as included in the Dutch National Indication Protocol for Proton Therapy (NIPP), using an independent cohort of patients treated with (chemo)radiotherapy at MAASTRO clinic.
Materials/Methods: The validation cohort comprised 277 head and neck cancer patients treated with (chemo)radiotherapy at MAASTRO clinic between 2019 and 2021. We evaluated model discrimination using statistical metrics including sensitivity, specificity and the area under the receiver operating characteristic (ROC) curve (AUC). After validation, we evaluated whether the NTCP model could be improved using the closed testing procedure (CTP). Specifically, we used calibration curves to graphically assess i) the original model, ii) recalibration in the large, iii) recalibration and iv) model revision. The discrimination and calibration performance of the models was also assessed quantitatively using statistical metrics including the Brier score, the AUC and the Hosmer–Lemeshow test.
Results: The original NTCP model for grade II-IV dysphagia at 6 months showed good discrimination in the independent MAASTRO cohort (AUC=0.80), but its calibration curve showed that it underestimated the risk of dysphagia in these patients. Therefore, we applied the CTP. The CTP indicated that the model needed to be updated and selected the revised model, with updated predictor coefficients. The revised model also had satisfactory discrimination in the MAASTRO cohort (AUC=0.83), with improved agreement between predicted and observed NTCP values. Furthermore, the Hosmer–Lemeshow test showed that the revised model was the best calibrated of the models assessed by the CTP, with no statistically significant difference between predicted and observed NTCP values (p=0.98). Moreover, the Brier score of the revised model (01.15) was the lowest among all the updated models, indicating higher accuracy.
Conclusion: The validation of the NIPP NTCP model for grade II-IV dysphagia was successful in our independent validation cohort, but the model can be improved using the CTP. Future steps include the participation of more independent radiotherapy centres in the validation of the NIPP NTCP models through a federated learning approach using the Personal Health Train (PHT) infrastructure. |
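To illustrate two of the updating steps named in the abstract, the sketch below fits a recalibration in the large (intercept update only) and a logistic recalibration (intercept and slope) on the logit of the original predictions, and computes the Brier score. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Newton-Raphson fitting and the toy data are hypothetical, and the likelihood-ratio tests that the closed testing procedure uses to choose between models are not shown.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def expit(z):
    """Inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-z))

def brier_score(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def recalibrate(y, p, fit_slope, n_iter=50):
    """Fit logit(p_new) = a + b * logit(p) by Newton-Raphson.

    fit_slope=False keeps b fixed at 1 (recalibration in the large);
    fit_slope=True estimates both a and b (logistic recalibration).
    Returns (a, b, updated_probabilities).
    """
    lp = [logit(pi) for pi in p]          # linear predictor of the original model
    a, b = 0.0, 1.0                       # start from the original model
    for _ in range(n_iter):
        q = [expit(a + b * x) for x in lp]
        w = [qi * (1.0 - qi) for qi in q]
        g0 = sum(yi - qi for yi, qi in zip(y, q))   # score for the intercept
        h00 = sum(w)
        if not fit_slope:
            a += g0 / h00                 # one-dimensional Newton step
            continue
        g1 = sum((yi - qi) * x for yi, qi, x in zip(y, q, lp))
        h01 = sum(wi * x for wi, x in zip(w, lp))
        h11 = sum(wi * x * x for wi, x in zip(w, lp))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det  # solve the 2x2 Newton system
        b += (h00 * g1 - h01 * g0) / det
    return a, b, [expit(a + b * x) for x in lp]

# Hypothetical toy data: the original predictions underestimate the event rate.
y_toy = [0, 1, 0, 1, 1, 0, 0, 1]
p_toy = [0.1, 0.2, 0.3, 0.4, 0.5, 0.2, 0.1, 0.3]

a_large, _, p_large = recalibrate(y_toy, p_toy, fit_slope=False)
a_full, b_full, p_full = recalibrate(y_toy, p_toy, fit_slope=True)
print("original Brier score:", brier_score(y_toy, p_toy))
print("recalibration in the large:", a_large, brier_score(y_toy, p_large))
print("logistic recalibration:", a_full, b_full, brier_score(y_toy, p_full))
```

In this sketch, a positive intercept update corresponds to an original model that underestimates risk, which is the pattern the abstract reports. Model revision, the fourth option, would re-estimate the coefficients of the individual predictors and therefore needs the predictor values themselves, which are not part of this example.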