892-P: Generating Intercorrelated Data for Simulation Samples in Diabetes Microsimulation Experiment

Authors: DAWEI GUAN, JI-HYUN LEE, XIANGYANG LOU, JIANG BIAN, YI GUO, JINGCHUAN GUO, HUI SHAO
Publication year: 2022
Subject:
Source: Diabetes. 71
ISSN: 0012-1797
Description: Traditional data-generating processes (DGPs) in simulation treat input variables as independently distributed. Ignoring the intercorrelation between variables reduces the accuracy of the simulation. A common solution is to use the correlation coefficients between variables to correct the DGP. However, challenges arise when the correlated variables follow different distribution types. Existing solutions cope with these challenges by restricting the correlation coefficients, which limits their practicality. This study aimed to validate a novel algorithm that relaxes these restrictions. We identified 426 individuals with self-reported diabetes from the National Health and Nutrition Examination Survey (2013-2018). We extracted the distribution parameters (mean and SD) and the correlation coefficient matrix of 18 variables often used to characterize diabetes populations, including demographics, biomarkers, and disease histories. Our algorithm involves two steps: all parameters were first simulated as correlated numeric variables with normal distributions and then converted into values that follow their corresponding distributions (e.g., Bernoulli). Simulation precision was evaluated by comparing observed and simulated data. There were no significant differences in any variable between the observed and the simulated data (all p > 0.05). The mean (SE) absolute difference between observed and simulated correlation coefficients was 0.032 (0.002). Among the 153 correlation coefficients, the difference between observed and simulated values was less than 0.… in 119 (77.8%) coefficients, between 0.… and 0.1 in 30 (19.6%) coefficients, and higher than 0.1 in 4 (2.6%) coefficients. The precision levels were consistent across subgroups stratified by age or cardiovascular disease. Our algorithm, with relaxed restrictions on intercorrelation, can accurately generate data close to the observed data based on distribution parameters and a correlation coefficient matrix.
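The two-step idea described above (draw correlated normal variables, then convert each one to its target distribution) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a generic latent-normal approach using only the Python standard library. The function names `cholesky` and `simulate`, the margin specification format, and the example values (an HbA1c-like variable with mean 7.5 and SD 1.5, a binary history with prevalence 0.25, latent correlation 0.3) are all hypothetical and are not taken from the study.

```python
import math
import random
from statistics import NormalDist, mean


def cholesky(corr):
    """Return lower-triangular L with L * L^T == corr (corr must be positive definite)."""
    n = len(corr)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(corr[i][i] - s)
            else:
                L[i][j] = (corr[i][j] - s) / L[j][j]
    return L


def simulate(n, corr, margins, seed=0):
    """Step 1: correlated standard normals. Step 2: convert each to its target margin."""
    rng = random.Random(seed)
    L = cholesky(corr)
    std = NormalDist()
    d = len(margins)
    rows = []
    for _ in range(n):
        # Step 1: independent N(0, 1) draws, correlated through the Cholesky factor.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        # Step 2: map each latent normal value onto its target distribution.
        row = []
        for value, margin in zip(x, margins):
            kind = margin[0]
            if kind == "normal":
                _, mu, sd = margin
                row.append(mu + sd * value)
            elif kind == "bernoulli":
                _, p = margin
                # 1 when the latent value falls in the upper tail of mass p.
                row.append(1 if value > std.inv_cdf(1.0 - p) else 0)
            else:
                raise ValueError(f"unsupported margin: {kind}")
        rows.append(row)
    return rows


# Hypothetical inputs for illustration only (not values from the study).
corr = [[1.0, 0.3],
        [0.3, 1.0]]
margins = [("normal", 7.5, 1.5),   # continuous, HbA1c-like
           ("bernoulli", 0.25)]    # binary, disease-history-like
rows = simulate(20000, corr, margins, seed=1)
```

In this sketch the correlation matrix is applied on the latent normal scale. Dichotomizing a normal variable attenuates its correlation with other variables, so a practical implementation would generally need to adjust the latent correlations to hit the observed ones; the abstract does not say how the authors handle this.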
Disclosure: D. Guan: None. J. Lee: None. X. Lou: None. J. Bian: None. Y. Guo: None. J. Guo: None. H. Shao: Board Member; BRAVO4HEALTH, LLC.
Database: OpenAIRE