Identifying subtypes of heart failure from three electronic health record sources with machine learning: an external, prognostic, and genetic validation study.
Authors: | Banerjee A; Institute of Health Informatics, University College London, London, UK; Health Data Research UK, London, UK; Barts Health NHS Trust, London, UK; Department of Cardiology, University College London Hospitals NHS Trust, London, UK; NIHR Biomedical Research Centre, University College London Hospitals NHS Trust, London, UK. Electronic address: ami.banerjee@ucl.ac.uk., Dashtban A; Institute of Health Informatics, University College London, London, UK., Chen S; Institute of Health Informatics, University College London, London, UK., Pasea L; Institute of Health Informatics, University College London, London, UK., Thygesen JH; Institute of Health Informatics, University College London, London, UK., Fatemifar G; Institute of Health Informatics, University College London, London, UK., Tyl B; Medical Affairs, Pharmaceuticals, Bayer HealthCare, Paris, France., Dyszynski T; Medical Affairs & Pharmacovigilance, Bayer AG, Berlin, Germany., Asselbergs FW; Institute of Health Informatics, University College London, London, UK; Health Data Research UK, London, UK; NIHR Biomedical Research Centre, University College London Hospitals NHS Trust, London, UK; Amsterdam University Medical Centers, Department of Cardiology, University of Amsterdam, Amsterdam, Netherlands., Lund LH; Division of Cardiology, Department of Medicine, Karolinska Institutet, Stockholm, Sweden; Heart and Vascular Theme, Karolinska University Hospital, Stockholm, Sweden., Lumbers T; Institute of Health Informatics, University College London, London, UK; Health Data Research UK, London, UK; Barts Health NHS Trust, London, UK; Department of Cardiology, University College London Hospitals NHS Trust, London, UK; NIHR Biomedical Research Centre, University College London Hospitals NHS Trust, London, UK., Denaxas S; Institute of Health Informatics, University College London, London, UK; Health Data Research UK, London, UK., Hemingway H; Institute of Health Informatics, University College London, London, UK; Health Data Research UK, London, UK; NIHR Biomedical Research Centre, University College London Hospitals NHS Trust, London, UK. |
---|---|
Language: | English |
Source: | The Lancet. Digital health [Lancet Digit Health] 2023 Jun; Vol. 5 (6), pp. e370-e379. |
DOI: | 10.1016/S2589-7500(23)00065-1 |
Abstract: | Background: Machine learning has been used to analyse heart failure subtypes, but not across large, distinct, population-based datasets, across the whole spectrum of causes and presentations, or with clinical and non-clinical validation by different machine learning methods. Using our published framework, we aimed to discover heart failure subtypes and validate them upon population representative data. Methods: In this external, prognostic, and genetic validation study we analysed individuals aged 30 years or older with incident heart failure from two population-based databases in the UK (Clinical Practice Research Datalink [CPRD] and The Health Improvement Network [THIN]) from 1998 to 2018. Pre-heart failure and post-heart failure factors (n=645) included demographic information, history, examination, blood laboratory values, and medications. We identified subtypes using four unsupervised machine learning methods (K-means, hierarchical, K-Medoids, and mixture model clustering) with 87 of 645 factors in each dataset. We evaluated subtypes for (1) external validity (across datasets); (2) prognostic validity (predictive accuracy for 1-year mortality); and (3) genetic validity (UK Biobank), association with polygenic risk score (PRS) for heart failure-related traits (n=11), and single nucleotide polymorphisms (n=12). Findings: We included 188 800, 124 262, and 9573 individuals with incident heart failure from CPRD, THIN, and UK Biobank, respectively, between Jan 1, 1998, and Jan 1, 2018. After identifying five clusters, we labelled heart failure subtypes as (1) early onset, (2) late onset, (3) atrial fibrillation related, (4) metabolic, and (5) cardiometabolic. In the external validity analysis, subtypes were similar across datasets (c-statistics: THIN model in CPRD ranged from 0·79 [subtype 3] to 0·94 [subtype 1], and CPRD model in THIN ranged from 0·79 [subtype 1] to 0·92 [subtypes 2 and 5]). In the prognostic validity analysis, 1-year all-cause mortality after heart failure diagnosis (subtype 1 0·20 [95% CI 0·14-0·25], subtype 2 0·46 [0·43-0·49], subtype 3 0·61 [0·57-0·64], subtype 4 0·11 [0·07-0·16], and subtype 5 0·37 [0·32-0·41]) differed across subtypes in CPRD and THIN data, as did risk of non-fatal cardiovascular diseases and all-cause hospitalisation. In the genetic validity analysis the atrial fibrillation-related subtype showed associations with the related PRS. Late onset and cardiometabolic subtypes were the most similar and strongly associated with PRS for hypertension, myocardial infarction, and obesity (p<0·0009). We developed a prototype app for routine clinical use, which could enable evaluation of effectiveness and cost-effectiveness. Interpretation: Across four methods and three datasets, including genetic data, in the largest study of incident heart failure to date, we identified five machine learning-informed subtypes, which might inform aetiological research, clinical risk prediction, and the design of heart failure trials. Funding: European Union Innovative Medicines Initiative-2. Competing Interests: Declaration of interests BT is an employee of Bayer and was previously an employee of Servier. TD is an employee of Bayer. AB is supported by research funding from the National Institute for Health Research (NIHR), British Medical Association, AstraZeneca, and UK Research and Innovation. HH is supported by Health Data Research UK (grant number LOND1), which is funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation, and Wellcome Trust; and is a NIHR Senior Investigator. AB, SD, FWA, and HH are funded by the NIHR University College London Hospitals Biomedical Research Centre. (Copyright © 2023 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.) |
Database: | MEDLINE |