Development and validation of 15-month mortality prediction models: a retrospective observational comparison of machine-learning techniques in a national sample of Medicare recipients
Author: | Virginia F Gurley, Gregory D. Berg |
---|---|
Publication year: | 2019 |
Subject: | Male; Female; Humans; Aged; Aged, 80 and over; United States; Medicare; Mortality; Palliative care; Terminal care; Hospice care; Machine learning; Classification; Decision trees; Decision tree model; Logistic regression; Logistic models; Stepwise regression; Lasso (statistics); Naive Bayes classifier; Bayes theorem; Neural networks, computer; Models, statistical; Cross-validation; Statistics; Medicine; General medicine; Research; Medical and health sciences; Clinical medicine; General & internal medicine; Health policy & services; Other medical science |
Source: | BMJ Open |
ISSN: | 2044-6055 |
DOI: | 10.1136/bmjopen-2018-022935 |
Description: | Objective: To develop and validate a predictive model for 15-month mortality using a random sample of community-dwelling Medicare beneficiaries. Data source: The Centres for Medicare & Medicaid Services’ Limited Data Set files containing the five per cent samples for 2014 and 2015. Participants: The data analysed contain de-identified administrative claims information at the beneficiary level, including diagnoses, procedures and demographics, for 2.7 million beneficiaries. Setting: US national sample of Medicare beneficiaries. Study design: Eleven different models were used to predict 15-month mortality risk: logistic regression (using both stepwise and least absolute shrinkage and selection operator (LASSO) selection of variables, as well as models using an age-gender baseline, Charlson scores, Charlson conditions, Elixhauser conditions and all variables), naïve Bayes, decision tree with adaptive boosting, neural network and support vector machines (SVMs), validated by simple cross-validation. Updated Charlson score weights were generated from the predictive model using only Charlson conditions. Primary outcome measure: C-statistic. Results: The c-statistic was 0.696 for the naïve Bayes model and 0.762 for the decision tree model. For models that used the Charlson score or the Charlson variables, the c-statistic was 0.713 and 0.726, respectively, similar to the 0.734 of the model using Elixhauser conditions. The c-statistic for the SVM model was 0.788, while the four best-performing models were the logistic regression using all variables, the logistic regression after selection of variables by the LASSO method, the logistic regression using a stepwise selection of variables and the neural network, with c-statistics of 0.798, 0.798, 0.797 and 0.795, respectively. Conclusions: Improved means of identifying individuals in the last 15 months of life are needed to improve the patient experience of care and reduce the per capita cost of healthcare. This study developed and validated a predictive model for 15-month mortality with higher generalisability than previous administrative claims-based studies. |
Database: | OpenAIRE |
External link: |
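
The primary outcome measure in this record, the c-statistic, can be read as the share of (decedent, survivor) pairs in which the decedent was given the higher predicted risk, with ties counted as half. The following is a minimal sketch of that definition only; it is not the authors' code, and the function name `c_statistic` and the toy data are hypothetical and not taken from the study.

```python
from typing import Sequence


def c_statistic(risks: Sequence[float], died: Sequence[int]) -> float:
    """Concordance (c-statistic) of predicted risks against a binary outcome.

    Returns the fraction of (decedent, survivor) pairs in which the decedent
    has the higher predicted risk; pairs with equal risk count as half.
    """
    if len(risks) != len(died):
        raise ValueError("risks and died must have the same length")
    pos = [r for r, d in zip(risks, died) if d == 1]  # risks of those who died
    neg = [r for r, d in zip(risks, died) if d == 0]  # risks of survivors
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Made-up toy data for illustration only (assumption, not study data).
toy_risks = [0.9, 0.8, 0.35, 0.4, 0.1]
toy_died = [1, 0, 1, 0, 0]
print(c_statistic(toy_risks, toy_died))
```

The pairwise loop is quadratic in the number of records, so for a sample on the scale described here (2.7 million beneficiaries) a rank-based computation would be the practical choice. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.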