Feature Selection Strategy for Intrahospital Mortality Prediction after Coronary Artery Bypass Graft Surgery on an Unbalanced Sample

Authors: Alexandra Kriger, Vladislav Rublev, Basil Shirobokov, Karina J. Shakhgeldyan, Dan Geltser, B. I. Geltser
Year of publication: 2020
Subject:
Source: CSAE
DOI: 10.1145/3424978.3425090
Description: The aim of the study is to develop models for predicting intrahospital mortality (IHM) on an unbalanced sample of patients with coronary artery disease (CAD) after coronary artery bypass graft (CABG) surgery. Methods. Models for IHM prediction were built from an analysis of 866 electronic case histories of CAD patients revascularized by CABG. The patient cohort consisted of two groups: the first included 35 (4%) patients who died within the first 30 days after CABG, and the second included 831 (96%) patients with a favorable operation outcome. We analyzed 99 factors, including the results of clinical, laboratory and instrumental studies obtained before CABG. To compile the feature set, classical filtering and model-based selection (wrapper) methods were used. The primary drawback of the classical approach was the unbalanced sample, as one group comprised only 4% of subjects. As a result, it was not possible to apply the cross-validation procedure with three types of samples, standard quality metrics or multi-category factors. Results. A feature search approach was proposed that uses a multi-stage selection procedure combining the validation of predefined predictors, filtering methods and multifactor model development based on logistic regression, random forest (RF) and artificial neural networks (ANNs). The models' accuracy was evaluated by a combined quality metric. RF- and ANN-based models not only yielded more accurate forecasting tools, but also helped verify five additional IHM predictors.
Database: OpenAIRE
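
The record's abstract notes that standard quality metrics and ordinary cross-validation were hard to apply with only 35 deaths among 866 patients. The following is a minimal Python sketch of why that is so, using only the class sizes reported in the record. The always-survive baseline, the use of balanced accuracy, the choice of five folds and all function names are assumptions made for illustration; the record does not define the study's combined quality metric or its validation scheme, and this sketch does not reproduce them.

```python
import random

N_DIED, N_SURVIVED = 35, 831  # class sizes reported in the record


def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn) with 1 = died, 0 = survived."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(y_true, y_pred):
    """Accuracy next to class-aware metrics (illustrative choice of metrics)."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def stratified_folds(y, k, seed=0):
    """Split indices into k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


y_true = [1] * N_DIED + [0] * N_SURVIVED
always_survive = [0] * len(y_true)  # trivial baseline, not a model from the study

baseline = metrics(y_true, always_survive)
folds = stratified_folds(y_true, k=5)  # k=5 is an assumed value
deaths_per_fold = [sum(y_true[i] for i in fold) for fold in folds]

print(baseline)
print(deaths_per_fold)
```

A baseline that never predicts death scores about 96% accuracy while detecting none of the 35 deaths, which is why a class-aware metric is needed. Even with stratification, each of five folds holds only a handful of deaths, so per-fold estimates rest on very few positive cases.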