Popis: |
BackgroundReducing care lapses for people living with HIV is critical to ending the HIV epidemic and beneficial for their health. Predictive modeling can identify clinical factors associated with HIV care lapses. Previous studies have identified these factors within a single clinic or using a national network of clinics, but public health strategies to improve retention in care in the United States often occur within a regional jurisdiction (eg, a city or county). ObjectiveWe sought to build predictive models of HIV care lapses using a large, multisite, noncurated database of electronic health records (EHRs) in Chicago, Illinois. MethodsWe used 2011-2019 data from the Chicago Area Patient-Centered Outcomes Research Network (CAPriCORN), a database including multiple health systems, covering the majority of 23,580 people with an HIV diagnosis living in Chicago. CAPriCORN uses a hash-based data deduplication method to follow people across multiple Chicago health care systems with different EHRs, providing a unique citywide view of retention in HIV care. From the database, we used diagnosis codes, medications, laboratory tests, demographics, and encounter information to build predictive models. Our primary outcome was lapses in HIV care, defined as having more than 12 months between subsequent HIV care encounters. We built logistic regression, random forest, elastic net logistic regression, and XGBoost models using all variables and compared their performance to a baseline logistic regression model containing only demographics and retention history. ResultsWe included people living with HIV with at least 2 HIV care encounters in the database, yielding 16,930 people living with HIV with 191,492 encounters. All models outperformed the baseline logistic regression model, with the most improvement from the XGBoost model (area under the receiver operating characteristic curve 0.776, 95% CI 0.768-0.784 vs 0.674, 95% CI 0.664-0.683; P |