Shapely additive values can effectively visualize pertinent covariates in machine learning when predicting hypertension

Author:	Alexander A. Huang, Samuel Y. Huang
Language:	English
Year of publication:	2023
Subject:	cardiology; hypertension; machine learning; model transparency; SHAP; statistics; Diseases of the circulatory (Cardiovascular) system; RC666-701
Source:	The Journal of Clinical Hypertension, Vol 25, Iss 12, Pp 1135-1144 (2023)
Document type:	article
ISSN:	1751-7176, 1524-6175
DOI:	10.1111/jch.14745
Description:	Abstract: Machine learning methods are widely used within the medical field to enhance prediction. However, little is known about the reliability and efficacy of these models to predict long‐term medical outcomes such as blood pressure using lifestyle factors, such as diet. The authors assessed whether machine‐learning techniques could accurately predict hypertension risk using nutritional information. This was a cross‐sectional study using data from the National Health and Nutrition Examination Survey (NHANES) between January 2017 and March 2020. XGBoost was used as the machine‐learning model of choice in this study due to its increased performance relative to other common methods within medical studies. Model prediction metrics (e.g., AUROC, balanced accuracy) were used to measure overall model efficacy; covariate gain statistics (the percentage each covariate contributes to the overall prediction) and SHapley Additive exPlanations (SHAP, a method to visualize each covariate) were used to provide explanations of machine‐learning output and increase the transparency of this otherwise cryptic method. Of a total of 9650 eligible patients, the mean age was 41.02 (SD = 22.16); 4792 (50%) were male and 4858 (50%) were female; 3407 (35%) were White, 2567 (27%) were Black, 2108 (22%) were Hispanic, and 981 (10%) were Asian. From evaluation of model gain statistics, age was found to be the single strongest predictor of hypertension, with a gain of 53.1%. Additionally, demographic factors such as poverty and Black race were also strong predictors of hypertension, with gains of 4.33% and 4.18%, respectively. Nutritional covariates contributed 37% to the overall prediction, with sodium, caffeine, potassium, and alcohol intake being significantly represented within the model. Machine learning can be used to predict hypertension.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/1f927d683f454ae2a07950ec47119e67
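
Note:	The abstract describes SHAP as a way to attribute a model's prediction to individual covariates. A minimal sketch of the underlying idea follows, assuming a hypothetical three-feature toy risk function (`toy_risk`, with invented coefficients) in place of the study's XGBoost model, and hypothetical baseline and patient values that are not taken from NHANES. It computes exact Shapley values by brute force over feature subsets; the SHAP library itself uses faster, model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition are held at their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for small illustrative cases.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical risk score; coefficients are invented, not from the study."""
    age, sodium, potassium = features
    return (0.02 * age + 0.0001 * sodium - 0.00005 * potassium
            + 0.000001 * age * sodium)


# Hypothetical reference individual and patient: (age, sodium mg, potassium mg)
baseline = [40.0, 3000.0, 2500.0]
patient = [65.0, 4000.0, 2000.0]

phi = shapley_values(toy_risk, patient, baseline)
for name, value in zip(["age", "sodium", "potassium"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the additive property that SHAP plots rely on.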