Machine Learning Methods for Identifying Atrial Fibrillation Cases and Their Predictors in Patients With Hypertrophic Cardiomyopathy: The HCM-AF-Risk Model.

Authors: Bhattacharya M; Computational Biomedicine and Machine Learning Lab, Department of Computer and Information Sciences, University of Delaware, Newark, Delaware, USA., Lu DY; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA.; Division of General Medicine, Taipei Veterans General Hospital, Taipei, Taiwan.; Institute of Public Health, National Yang-Ming University, Taipei, Taiwan.; Hypertrophic Cardiomyopathy Center of Excellence, Division of Cardiology, University of California San Francisco, San Francisco, California, USA., Ventoulis I; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA., Greenland GV; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA.; Hypertrophic Cardiomyopathy Center of Excellence, Division of Cardiology, University of California San Francisco, San Francisco, California, USA., Yalcin H; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA., Guan Y; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA., Marine JE; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA., Olgin JE; Hypertrophic Cardiomyopathy Center of Excellence, Division of Cardiology, University of California San Francisco, San Francisco, California, USA., Zimmerman SL; Department of Radiology, Johns Hopkins University, Baltimore, Maryland, USA., Abraham TP; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA.; Hypertrophic Cardiomyopathy Center of Excellence, Division of Cardiology, University of California San Francisco, San Francisco, California, USA., Abraham MR; Hypertrophic Cardiomyopathy Center of Excellence, Johns Hopkins University, Baltimore, Maryland, USA.; Hypertrophic Cardiomyopathy Center of Excellence, Division of Cardiology, University of California San Francisco, San Francisco, California, USA., Shatkay H; Computational Biomedicine and Machine Learning Lab, Department of Computer and Information Sciences, University of Delaware, Newark, Delaware, USA.
Language: English
Source: CJC open [CJC Open] 2021 Feb 02; Vol. 3 (6), pp. 801-813. Date of Electronic Publication: 2021 Feb 02 (Print Publication: 2021).
DOI: 10.1016/j.cjco.2021.01.016
Abstract: Background: Hypertrophic cardiomyopathy (HCM) patients have a high incidence of atrial fibrillation (AF) and increased stroke risk, even with low CHA2DS2-VASc (congestive heart failure, hypertension, age, diabetes, previous stroke/transient ischemic attack) scores. Hence, there is a need to understand the pathophysiology of AF/stroke in HCM. In this retrospective study, we develop and apply a data-driven, machine learning-based method to identify AF cases, and clinical/imaging features associated with AF, using electronic health record data.
Methods: HCM patients with documented paroxysmal/persistent/permanent AF (n = 191) were considered AF cases, and the remaining patients in sinus rhythm (n = 640) were tagged as No-AF. We evaluated 93 clinical variables; the most informative variables useful for distinguishing AF from No-AF cases were selected based on the 2-sample t test and the information gain criterion.
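The two selection criteria named in the Methods can be illustrated with a minimal sketch. It assumes the unequal-variance (Welch) form of the 2-sample t statistic and an information gain computed from a single threshold split; the abstract does not specify either detail. The measurements, the threshold, and all function names are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t statistic, unequal-variance (Welch) form (an assumption)."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    h = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        h -= p * math.log2(p)
    return h


def information_gain(values, labels, threshold):
    """Reduction in label entropy after splitting one variable at a threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - conditional


# Made-up measurements of one hypothetical variable (not study data).
af_values = [48, 52, 45, 50]             # AF cases
no_af_values = [38, 41, 36, 40, 39, 42]  # No-AF cases
values = af_values + no_af_values
labels = [1] * len(af_values) + [0] * len(no_af_values)

t_stat = welch_t(af_values, no_af_values)
gain = information_gain(values, labels, threshold=43)
print(f"t = {t_stat:.2f}, information gain = {gain:.3f} bits")
```

A variable would be retained under such a scheme if its t statistic is large in magnitude and its information gain is high; the sign of the t statistic indicates whether the variable is higher or lower in AF cases.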
Results: We identified 18 highly informative variables that are positively (n = 11) and negatively (n = 7) correlated with AF in HCM. Next, patient records were represented via these 18 variables. Data imbalance resulting from the relatively low number of AF cases was addressed via a combination of oversampling and undersampling strategies. We trained and tested multiple classifiers under this sampling approach, showing effective classification. Specifically, an ensemble of logistic regression and naïve Bayes classifiers, trained based on the 18 variables and corrected for data imbalance, proved most effective for separating AF from No-AF cases (sensitivity = 0.74, specificity = 0.70, C-index = 0.80).
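The imbalance handling and evaluation described in the Results can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the per-class target of 400, the averaging of two classifiers' probabilities as the ensemble rule, the 0.5 decision threshold, and all function names are hypothetical. Only the class counts (191 AF, 640 No-AF) come from the abstract. In practice, resampling of this kind is applied to training data only, so that test performance reflects the original class distribution.

```python
import random


def rebalance(records, labels, target_per_class, seed=0):
    """Undersample larger classes and oversample smaller ones to a common size."""
    rng = random.Random(seed)
    out_records, out_labels = [], []
    for cls in sorted(set(labels)):
        pool = [r for r, y in zip(records, labels) if y == cls]
        if len(pool) >= target_per_class:
            # Undersampling: draw without replacement.
            chosen = rng.sample(pool, target_per_class)
        else:
            # Oversampling: keep every record, then duplicate random ones.
            chosen = pool + rng.choices(pool, k=target_per_class - len(pool))
        out_records.extend(chosen)
        out_labels.extend([cls] * target_per_class)
    return out_records, out_labels


def soft_vote(prob_a, prob_b):
    """Average two classifiers' predicted AF probabilities (assumed ensemble rule)."""
    return [(pa + pb) / 2 for pa, pb in zip(prob_a, prob_b)]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Class counts from the abstract; record IDs stand in for patient records.
labels = [1] * 191 + [0] * 640
records = list(range(len(labels)))
balanced_records, balanced_labels = rebalance(records, labels, target_per_class=400)
print(balanced_labels.count(1), balanced_labels.count(0))

# Hypothetical probabilities from two classifiers for four patients.
combined = soft_vote([0.9, 0.2, 0.6, 0.1], [0.7, 0.4, 0.3, 0.2])
predictions = [1 if p >= 0.5 else 0 for p in combined]
print(sensitivity_specificity([1, 0, 1, 0], predictions))
```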
Conclusions: Our model (HCM-AF-Risk Model) is the first machine learning-based method for identification of AF cases in HCM. This model demonstrates good performance, addresses data imbalance, and suggests that AF is associated with a more severe cardiac HCM phenotype.
(© 2021 The Authors.)
Database: MEDLINE