Foresight - a generative pretrained transformer for modelling of patient timelines using electronic health records: a retrospective modelling study.

Authors: Kraljevic Z; Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; National Institute for Health and Care Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, UK., Bean D; Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; National Institute for Health and Care Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, UK., Shek A; Department of Neurology, King's College Hospital National Health Service (NHS) Foundation Trust, London, UK; Guy's and St Thomas' NHS Foundation Trust, London, UK., Bendayan R; Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; National Institute for Health and Care Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, UK., Hemingway H; Health Data Research UK London and Institute of Health Informatics, University College London, London, UK; NIHR Biomedical Research Centre at University College London Hospitals NHS Foundation Trust, London, UK., Yeung JA; Department of Neurology, King's College Hospital National Health Service (NHS) Foundation Trust, London, UK; Guy's and St Thomas' NHS Foundation Trust, London, UK., Deng A; Guy's and St Thomas' NHS Foundation Trust, London, UK., Baston A; Guy's and St Thomas' NHS Foundation Trust, London, UK., Ross J; Guy's and St Thomas' NHS Foundation Trust, London, UK., Idowu E; Guy's and St Thomas' NHS Foundation Trust, London, UK., Teo JT; Department of Neurology, King's College Hospital National Health Service (NHS) Foundation Trust, London, UK; Guy's and St Thomas' NHS Foundation Trust, London, UK., Dobson RJB; Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; Health Data Research UK London and Institute of Health Informatics, University College London, London, UK; National Institute for Health and Care Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, UK; NIHR Biomedical Research Centre at University College London Hospitals NHS Foundation Trust, London, UK. Electronic address: richard.j.dobson@kcl.ac.uk.
Language: English
Source: The Lancet. Digital health [Lancet Digit Health] 2024 Apr; Vol. 6 (4), pp. e281-e290.
DOI: 10.1016/S2589-7500(24)00025-6
Abstract: Background: An electronic health record (EHR) holds detailed longitudinal information about a patient's health status and general clinical history, a large portion of which is stored as unstructured, free text. Existing approaches to model a patient's trajectory focus mostly on structured data and a subset of single-domain outcomes. This study aims to evaluate the effectiveness of Foresight, a generative transformer, in temporal modelling of patient data, integrating both free text and structured formats, to predict a diverse array of future medical outcomes, such as disorders, substances (eg, to do with medicines, allergies, or poisonings), procedures, and findings (eg, relating to observations, judgements, or assessments).
Methods: Foresight is a novel transformer-based pipeline that uses named entity recognition and linking tools to convert EHR document text into structured, coded concepts, followed by providing probabilistic forecasts for future medical events, such as disorders, substances, procedures, and findings. The Foresight pipeline has four main components: (1) CogStack (data retrieval and preprocessing); (2) the Medical Concept Annotation Toolkit (structuring of the free-text information from EHRs); (3) Foresight Core (deep-learning model for biomedical concept modelling); and (4) the Foresight web application. We processed the entire free-text portion from three different hospital datasets (King's College Hospital [KCH], South London and Maudsley [SLaM], and the US Medical Information Mart for Intensive Care III [MIMIC-III]), resulting in information from 811 336 patients and covering both physical and mental health institutions. We measured the performance of models using custom metrics derived from precision and recall.
Findings: Foresight achieved a precision@10 (ie, of 10 forecasted candidates, at least one is correct) of 0·68 (SD 0·0027) for the KCH dataset, 0·76 (0·0032) for the SLaM dataset, and 0·88 (0·0018) for the MIMIC-III dataset, for forecasting the next new disorder in a patient timeline. Foresight also achieved a precision@10 value of 0·80 (0·0013) for the KCH dataset, 0·81 (0·0026) for the SLaM dataset, and 0·91 (0·0011) for the MIMIC-III dataset, for forecasting the next new biomedical concept. In addition, Foresight was validated on 34 synthetic patient timelines by five clinicians and achieved a relevancy of 33 (97% [95% CI 91-100]) of 34 for the top forecasted candidate disorder. As a generative model, Foresight can forecast follow-on biomedical concepts for as many steps as required.
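The parenthetical definition of precision@10 above (of 10 forecasted candidates, at least one is correct) can be illustrated with a minimal sketch. This is not the study's evaluation code: the abstract says only that custom metrics derived from precision and recall were used, so the function names, the input layout, and the concept labels below are hypothetical, and the real metric may differ in what counts as a correct candidate.

```python
from typing import Sequence, Set


def hit_at_k(ranked: Sequence[str], correct: Set[str], k: int = 10) -> bool:
    """Return True if any of the top-k ranked candidates is a correct concept."""
    return any(candidate in correct for candidate in ranked[:k])


def precision_at_k(
    forecasts: Sequence[Sequence[str]],
    targets: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Fraction of forecasts whose top-k candidates contain a correct concept.

    forecasts[i] is a ranked candidate list (most probable first) and
    targets[i] is the set of concepts counted as correct for that forecast.
    Both layouts are assumptions made for this illustration.
    """
    if len(forecasts) != len(targets):
        raise ValueError("forecasts and targets must have the same length")
    if not forecasts:
        raise ValueError("at least one forecast is required")
    hits = sum(hit_at_k(r, t, k) for r, t in zip(forecasts, targets))
    return hits / len(forecasts)


# Hypothetical concept labels, for illustration only.
example_forecasts = [
    ["hypertension", "type 2 diabetes", "asthma"],
    ["migraine", "epilepsy"],
]
example_targets = [{"asthma"}, {"stroke"}]
example_score = precision_at_k(example_forecasts, example_targets, k=10)
```

Under this reading, a forecast is scored as a hit as soon as one correct concept appears anywhere in the top 10, so the value is a per-forecast hit rate rather than the fraction of the 10 candidates that are correct.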
Interpretation: Foresight is a general-purpose model for biomedical concept modelling that can be used for real-world risk forecasting, virtual trials, and clinical research to study the progression of disorders, to simulate interventions and counterfactuals, and for educational purposes.
Funding: National Health Service Artificial Intelligence Laboratory, National Institute for Health and Care Research Biomedical Research Centre, and Health Data Research UK.
Competing Interests: DB was employed at AstraZeneca after manuscript preparation. RJBD and JTT are co-founders and directors of CogStack. All other authors declare no competing interests.
(Copyright © 2024 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.)
Database: MEDLINE