Combining Contrast Mining with Logistic Regression To Predict Healthcare Utilization in a Managed Care Population
Author: | Lincoln Sheets, Chi-Ren Shyu, Gregory F. Petroski, Jerry C. Parker, Michael A. Phinney, Yan Zhuang, Bin Ge |
---|---|
Publication year: | 2017 |
Subject: | Population; Health Informatics; Population health; Logistic regression; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Environmental health; Health care; Data Mining; Electronic Health Records; Humans; Medicine; 030212 general & internal medicine; education; education.field_of_study; business.industry; 030503 health policy & services; Managed Care Programs; Predictive analytics; Data science; Computer Science Applications; Logistic Models; Managed care; 0305 other medical science; business; Delivery of Health Care; Medicaid; Predictive modelling |
Source: | Applied Clinical Informatics. 8:430-446 |
ISSN: | 1869-0327 |
DOI: | 10.4338/aci-2016-05-ra-0078 |
Description: | Summary. Background: Because 5% of patients incur 50% of healthcare expenses, population health managers need to focus preventive and longitudinal care on the patients who are at highest risk of increased utilization. Predictive analytics can be used to identify these patients and to better manage their care. Data mining permits the development of models that surpass the size restrictions of traditional statistical methods and take advantage of the rich data available in the electronic health record (EHR), without limiting predictions to specific chronic conditions. Objective: To demonstrate the usefulness of unrestricted EHR data for predictive analytics in managed healthcare. Methods: In a population of 9,568 Medicare and Medicaid beneficiaries, patients in the highest 5% of charges were compared to an equal number of patients with the lowest charges. Contrast mining was used to discover the combinations of clinical attributes frequently associated with high utilization and infrequently associated with low utilization. The attributes found in these combinations were then tested by multiple logistic regression, and the discrimination of the model was evaluated by the c-statistic. Results: Of 19,014 potential EHR patient attributes, 67 were found in combinations frequently associated with high utilization but not with low utilization (support > 20%). Eleven of these attributes were significantly associated with high utilization (p… Conclusions: EHR mining reduced an unusably high number of patient attributes to a manageable set of potential healthcare utilization predictors, without conjecturing on which attributes would be useful. Treating these results as hypotheses to be tested by conventional methods yielded a highly accurate predictive model.
This novel, two-step methodology can help population health managers focus preventive and longitudinal care on the patients who are at highest risk of increased utilization. Citation: Sheets L, Petroski GF, Zhuang Y, Phinney MA, Ge B, Parker JC, Shyu C-R. Combining contrast mining with logistic regression to predict healthcare utilization in a managed care population. Appl Clin Inform 2017; 8: 430–446. https://doi.org/10.4338/ACI-2016-05-RA-0078 |
Database: | OpenAIRE |
External link: |
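
The two-step method summarized in the description (contrast mining to shortlist attributes, then logistic regression scored by the c-statistic) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the record does not describe. The toy patients, the attribute names, the `max_low_support` cutoff, the itemset size limit, and the use of plain gradient descent in place of a statistical package are all assumptions made for illustration; only the 20% support threshold comes from the abstract, and no significance testing is performed here.

```python
from itertools import combinations
import math

def support(itemset, patients):
    """Fraction of patients whose attribute set contains every item in itemset."""
    if not patients:
        return 0.0
    return sum(1 for p in patients if itemset <= p) / len(patients)

def contrast_attributes(high, low, min_support=0.2, max_low_support=0.05, max_size=2):
    """Collect attributes from itemsets frequent in `high` but rare in `low`.

    min_support mirrors the 20% threshold in the abstract; max_low_support
    and max_size are hypothetical choices for this sketch.
    """
    items = sorted(set().union(*high))
    selected = set()
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            s = frozenset(combo)
            if support(s, high) > min_support and support(s, low) <= max_low_support:
                selected |= s
    return sorted(selected)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent; w[0] is the intercept."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def c_statistic(scores, y):
    """Probability that a random positive case outscores a random negative one."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (invented): each patient is a set of EHR attributes.
high = [{"diabetes", "ckd"}, {"diabetes", "ckd", "copd"}, {"diabetes"}, {"ckd", "copd"}]
low = [{"flu_shot"}, {"flu_shot"}, {"copd", "flu_shot"}, set()]

# Step 1: contrast mining shortlists candidate predictors.
selected = contrast_attributes(high, low)

# Step 2: logistic regression on the shortlisted attributes only.
patients = high + low
y = [1] * len(high) + [0] * len(low)
X = [[1.0 if a in p else 0.0 for a in selected] for p in patients]
w = fit_logistic(X, y)
scores = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]
c = c_statistic(scores, y)
print(selected, round(c, 3))
```

In this toy data, `copd` is not contrastive on its own because it also appears in a low-utilization patient, but it is shortlisted through the pair `{ckd, copd}`, which shows how attributes enter the regression via the combinations they belong to. The c-statistic here is computed on the training data, so it says nothing about out-of-sample discrimination.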