How to Identify Potential Candidates for HIV Pre-Exposure Prophylaxis: An AI Algorithm Reusing Real-World Hospital Data

Authors: Emmanuelle Sylvestre, Marc Cuggia, Guillaume Bouzillé, Jean-Charles Duthe, Cedric Arvieux, Emmanuel Chazard
Contributors: Laboratoire Traitement du Signal et de l'Image (LTSI), Université de Rennes (UR)-Institut National de la Santé et de la Recherche Médicale (INSERM), CHU Pontchaillou [Rennes], CHU Lille, Evaluation des technologies de santé et des pratiques médicales - ULR 2694 (METRICS), Université de Lille-Centre Hospitalier Régional Universitaire [Lille] (CHRU Lille), Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National de la Santé et de la Recherche Médicale (INSERM), Jonchère, Laurent
Language: English
Year of publication: 2021
Subject:
Source: Studies in Health Technology and Informatics
Studies in Health Technology and Informatics, IOS Press, 2021, 281, pp.714-718. ⟨10.3233/SHTI210265⟩
MIE
ISSN: 0926-9630
1879-8365
DOI: 10.3233/SHTI210265
Description: HIV Pre-Exposure Prophylaxis (PrEP) is effective in Men who have Sex with Men (MSM) and is reimbursed by social security in France. Yet PrEP is underused because people at risk of HIV infection are difficult to identify outside the "sexual health" care path. We developed and validated an automated algorithm that reuses Electronic Health Record (EHR) data available in eHOP, the Clinical Data Warehouse of Rennes University Hospital (France). Using machine learning methods, we developed five models to predict incident HIV infections from 162 EHR variables that might be exploited to predict HIV risk. We divided patients aged 18 or older with at least one hospital admission between 2013 and 2019 into two groups: cases (patients with known HIV infection in the study period) and controls (patients without known HIV infection and without PrEP in the study period, but with at least one HIV risk factor). Among the 624,708 admissions, we selected 156 cases (incident HIV infection) and 761 controls. The best-performing model for identifying incident HIV infections was the combined model (LASSO, Random Forest, and Generalized Linear Model): AUC = 0.88 (95% CI: 0.8143-0.9619), specificity = 0.887, and sensitivity = 0.733 on the test dataset. The algorithm appears to identify patients at risk of HIV infection efficiently.
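The metrics quoted in the description (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. The description does not say how the LASSO, Random Forest, and Generalized Linear Model outputs were combined, so the simple averaging in `combine_scores` is an assumption, as are all function names, the 0.5 threshold, and the toy scores, none of which come from the study.

```python
from statistics import mean


def combine_scores(*model_scores):
    """Average the per-patient probabilities of several models.

    Assumption: plain averaging; the source does not state the combination rule.
    """
    return [mean(patient) for patient in zip(*model_scores)]


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random case outranks a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = case (incident HIV infection), 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0]
lasso_scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.7]
forest_scores = [0.8, 0.7, 0.4, 0.3, 0.1, 0.2, 0.5]
glm_scores = [0.7, 0.8, 0.5, 0.2, 0.3, 0.3, 0.6]

combined = combine_scores(lasso_scores, forest_scores, glm_scores)
sens, spec = sensitivity_specificity(labels, combined)
print(f"AUC={auc(labels, combined):.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which may be why the description reports all three.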
Database: OpenAIRE