External Validation of Mortality Prediction Models for Critical Illness Reveals Preserved Discrimination but Poor Calibration.

Authors: Cox EGM; Department of Critical Care, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands., Wiersema R; Department of Critical Care, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.; Department of Cardiology, Erasmus University Medical Center, Rotterdam, The Netherlands., Eck RJ; Department of Internal Medicine, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands., Kaufmann T; Department of Anesthesiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands., Granholm A; Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark., Vaara ST; Division of Intensive Care Medicine, Department of Anesthesiology, Intensive Care and Pain Medicine, University of Helsinki and Helsinki University Hospital, Helsinki, Finland., Møller MH; Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark.; Collaboration for Research in Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark., van Bussel BCT; Department of Intensive Care Medicine, Maastricht University, Maastricht University Medical Center+, Maastricht, The Netherlands.; Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, The Netherlands., Snieder H; Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands., Pleijhuis RG; Department of Internal Medicine, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands., van der Horst ICC; Department of Intensive Care Medicine, Maastricht University, Maastricht University Medical Center+, Maastricht, The Netherlands.; Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands., Keus F; Department of Critical Care, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.
Language: English
Source: Critical care medicine [Crit Care Med] 2023 Jan 01; Vol. 51 (1), pp. 80-90. Date of Electronic Publication: 2022 Nov 15.
DOI: 10.1097/CCM.0000000000005712
Abstract: Objectives: In a recent scoping review, we identified 43 mortality prediction models for critically ill patients. We aimed to assess the performances of these models through external validation.
Design: Multicenter study.
Setting: External validation of models was performed in the Simple Intensive Care Studies-I (SICS-I) and the Finnish Acute Kidney Injury (FINNAKI) study.
Patients: The SICS-I study consisted of 1,075 patients, and the FINNAKI study consisted of 2,901 critically ill patients.
Measurements and Main Results: For each model, we assessed: 1) the original publications for the data needed for model reconstruction, 2) availability of the variables, 3) model performance in two independent cohorts, and 4) the effects of recalibration on model performance. The models were recalibrated using data of the SICS-I and subsequently validated using data of the FINNAKI study. We evaluated overall model performance using various indexes, including the (scaled) Brier score, discrimination (area under the curve of the receiver operating characteristics), calibration (intercepts and slopes), and decision curves. Eleven models (26%) could be externally validated. The Acute Physiology And Chronic Health Evaluation (APACHE) II, APACHE IV, Simplified Acute Physiology Score (SAPS)-Reduced (SAPS-R), and Simplified Mortality Score for the ICU models showed the best scaled Brier scores of 0.11, 0.10, 0.10, and 0.06, respectively. SAPS II, APACHE II, and APACHE IV discriminated best; overall discrimination of models ranged from area under the curve of the receiver operating characteristics of 0.63 (0.61-0.66) to 0.83 (0.81-0.85). We observed poor calibration in most models, which improved to at least moderate after recalibration of intercepts and slopes. The decision curve showed a positive net benefit in the 0-60% threshold probability range for APACHE IV and SAPS-R.
Conclusions: In only 11 out of 43 available mortality prediction models, the performance could be studied using two cohorts of critically ill patients. External validation showed that the discriminative ability of APACHE II, APACHE IV, and SAPS II was acceptable to excellent, whereas calibration was poor.
Competing Interests: Drs. Granholm and Møller were involved in the development of one of the prediction models included. Dr. Pleijhuis reports a minority stake in Evidencio B.V., an online platform offering free services regarding the creation, validation, and implementation of clinical prediction models. Evidencio was not involved in the development of any of the prediction models mentioned, nor is it expected to benefit financially by the publication of this manuscript. Dr. Granholm received funding from Sygeforsikringen “Danmark”; he disclosed that he was involved in the development of one of the prediction models assessed in the study. The remaining authors have disclosed that they do not have any potential conflicts of interest.
(Copyright © 2022 by the Society of Critical Care Medicine and Wolters Kluwer Health, Inc. All Rights Reserved.)
Database: MEDLINE