Development and validation of 15-month mortality prediction models: a retrospective observational comparison of machine-learning techniques in a national sample of Medicare recipients

Autor: Virginia F Gurley, Gregory D. Berg
Rok vydání: 2019
Předmět:
Zdroj: BMJ Open
ISSN: 2044-6055
DOI: 10.1136/bmjopen-2018-022935
Popis: ObjectiveThe objective is to develop and validate a predictive model for 15-month mortality using a random sample of community-dwelling Medicare beneficiaries.Data sourceThe Centres for Medicare & Medicaid Services’ Limited Data Set files containing the five per cent samples for 2014 and 2015.ParticipantsThe data analysed contains de-identified administrative claims information at the beneficiary level, including diagnoses, procedures and demographics for 2.7 million beneficiaries.SettingUS national sample of Medicare beneficiaries.Study designEleven different models were used to predict 15-month mortality risk: logistic regression (using both stepwise and least absolute shrinkage and selection operator (LASSO) selection of variables as well as models using an age gender baseline, Charlson scores, Charlson conditions, Elixhauser conditions and all variables), naïve Bayes, decision tree with adaptive boosting, neural network and support vector machines (SVMs) validated by simple cross validation. Updated Charlson score weights were generated from the predictive model using only Charlson conditions.Primary outcome measureC-statistic.ResultsThe c-statistics was 0.696 for the naïve Bayes model and 0.762 for the decision tree model. For models that used the Charlson score or the Charlson variables the c-statistic was 0.713 and 0.726, respectively, similar to the model using Elixhauser conditions of 0.734. The c-statistic for the SVM model was 0.788 while the four models that performed the best were the logistic regression using all variables, logistic regression after selection of variables by the LASSO method, the logistic regression using a stepwise selection of variables and the neural network with c-statistics of 0.798, 0.798, 0.797 and 0.795, respectively.ConclusionsImproved means for identifying individuals in the last 15 months of life is needed to improve the patient experience of care and reducing the per capita cost of healthcare. This study developed and validated a predictive model for 15-month mortality with higher generalisability than previous administrative claims-based studies.
Databáze: OpenAIRE