Popis: |
Objective: To mitigate the burden associated with heart failure (HF), primary prevention is of the utmost importance. To improve early risk stratification, advanced computational methods such as machine learning (ML), which can capture complex individual patterns in large data sets, may be necessary. We therefore compared the predictive performance of incident HF risk models in terms of (a) flexible ML models versus linear models and (b) models trained on a single cohort (single-center) versus on multiple heterogeneous cohorts (multi-center).

Design and methods: Our analysis used pooled data from 6 cohorts comprising 30,354 individuals. During a median follow-up of 5.40 years, 1,068 individuals experienced a non-fatal HF event. We evaluated the predictive performance of survival gradient boosting (SGB), CoxNet, the PCP-HF risk score, and a stacking method. Predictions were obtained iteratively, with one cohort serving as an external test set in each iteration and either one or all of the remaining cohorts serving as the training set (single-center or multi-center, respectively).

Results: Overall, multi-center models systematically outperformed single-center models. Furthermore, the c-index in the pooled population was higher for SGB (0.735) than for CoxNet (0.694). In the precision-recall (PR) analysis for predicting 10-year HF risk, the stacking method, which combines the SGB, CoxNet, Gaussian mixture, and PCP-HF models, outperformed the other models with a PR AUC of 0.804, whereas PCP-HF achieved only 0.551.

Conclusion: With a greater number and variety of training cohorts, a model learns a wider range of specific individual health characteristics. Flexible ML algorithms can capture these diverse distributions and produce more precise prediction models.
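
As an illustration of the leave-one-cohort-out scheme described above, the following minimal Python sketch uses scikit-survival. The column names ('cohort', 'hf_event', 'followup_years') and the feature list are hypothetical placeholders rather than the study's actual variables, and CoxnetSurvivalAnalysis could be substituted for the boosting model in the same loop.

import pandas as pd
from sksurv.ensemble import GradientBoostingSurvivalAnalysis
from sksurv.util import Surv
from sksurv.metrics import concordance_index_censored

def leave_one_cohort_out_cindex(df: pd.DataFrame, feature_cols: list) -> dict:
    # Hedged sketch, not the study's actual pipeline: column names,
    # features, and hyperparameters are placeholders.
    # Multi-center setting: train on all-but-one cohort, test on the held-out one.
    results = {}
    for test_cohort in df["cohort"].unique():
        train = df[df["cohort"] != test_cohort]
        test = df[df["cohort"] == test_cohort]

        # Structured survival target: (event indicator, follow-up time).
        y_train = Surv.from_arrays(train["hf_event"].astype(bool),
                                   train["followup_years"])

        model = GradientBoostingSurvivalAnalysis(random_state=0)
        model.fit(train[feature_cols], y_train)

        # Higher predicted score corresponds to higher estimated HF risk.
        risk = model.predict(test[feature_cols])
        cindex = concordance_index_censored(test["hf_event"].astype(bool),
                                            test["followup_years"],
                                            risk)[0]
        results[test_cohort] = cindex
    return results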