Author: |
Zhu CX; Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China., Song YX; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China., Hao YT; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China., Chen F; Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China., Wei YY; Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China. |
Abstract: |
Disease risk prediction models are the basis of precision prevention and an essential reference for clinical treatment decisions. Developing them requires large amounts of high-quality data, for which large population cohort studies are an important source. The United Kingdom Biobank (UKB), a mega-population cohort and biobank, has played an essential role in research on disease etiology, prevention, and control, owing to its rich baseline and follow-up data and its concept and mechanism of global sharing. This study followed the PRISMA guidelines and included 210 articles with corresponding authors from 18 countries, of which 58 (27.62%) were from the UKB. A total of 491 disease risk prediction models were extracted, covering cancer, cardiovascular and cerebrovascular diseases, endocrine and metabolic diseases, respiratory diseases, and other diseases and their subgroups. Of these, 132 were developed in the UKB without validation, 183 were developed in the UKB with internal validation, 17 were developed in the UKB with external validation, and 159 were developed externally and validated in the UKB. A total of 188 models (38.29%) used only macro variables, and 303 models (61.71%) combined macro and micro variables. Model construction methods included survival outcome models, logistic regression, and machine learning. Survival outcome models were dominated by Cox proportional hazards regression, with a few models considering competing risks, accelerated failure time models, or different baseline hazard functions. Machine learning models included random forest, XGBoost, CatBoost, support vector machine, convolutional neural network, and other methods. The UKB is an essential resource for risk prediction modeling studies of multiple diseases. |