Exploring Non-random Sampling in Randomized Controlled Trials

Author: Han-Chu Lien, 連韓竹
Year of publication: 2019
Document type: thesis
Description: 107
Introduction: “Non-random sampling data” refers to data from randomized controlled trials (RCTs) whose baseline covariates are not balanced between allocation groups, which suggests possible data anomalies. Recently, Carlisle (2017) proposed a screening method to detect possible non-random sampling in RCTs, based on the theory that comparisons of baseline variables between allocation groups should produce a uniform distribution of p-values. However, some assumptions underlying this method are commonly violated in RCTs. The aim of the present study was to investigate how violating these assumptions affects the validity of Carlisle’s method in detecting non-random sampling.
Methods: Simulations and an empirical assessment were conducted to explore the effect of violating the method's assumptions. In the simulations, hypothetical RCT data were generated under three assumption-violating scenarios: correlated variables, non-normal data, and imprecisely reported data. P-values were obtained from comparisons between allocation groups using t-tests or ANOVA. The validity of Carlisle’s method was assessed by checking the uniformity of the p-value distribution. In the empirical assessment, we examined the clinically important variables of all RCTs included in the network meta-analysis of Tu (2012) and discussed the limitations of applying the detection method.
Results: Our simulations found inflation of type I error in all assumption-violating scenarios. The clustering effect of correlated variables was amplified as the number of variables increased. The skewing effect of non-normal data weakened as the sample size increased, in line with the central limit theorem. Imprecise reporting made the data in the two groups appear more similar, increasing the chance of a trial being incorrectly flagged as unusual. This bias was amplified as the sample size increased. In the empirical assessment, we found non-uniformly distributed p-values in CAL and in different study design groups. This result suggests that the randomization design may affect the distribution of baseline p-values.
Conclusions: Carlisle’s method performs well only if the data are independent, normally distributed, and reported with sufficient precision. Otherwise, type I error is inflated and the method is no longer valid. For RCTs flagged as unusual by Carlisle’s method, further investigation should be pursued to determine whether the data truly did not come from random samples or whether the finding is a false alarm.
Database: Networked Digital Library of Theses & Dissertations
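
The imprecise-reporting scenario described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's code and does not implement Carlisle's full method; it only shows the underlying idea that baseline p-values from truly randomized groups should look uniform, and that rounding the reported means and standard deviations can distort that distribution. All specifics are assumptions chosen for illustration: the group size (100), the true mean and SD (5.0 and 1.0), rounding to one decimal place, a normal approximation in place of the exact t-test, and a hand-written Kolmogorov-Smirnov statistic as the uniformity check.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def p_value_from_summary(m1, s1, m2, s2, n):
    """Two-sided p-value for a difference in means, from summary statistics.

    Assumption: uses a normal approximation to the two-sample t-test,
    which is reasonable only when n per group is large.
    """
    se = math.sqrt((s1 ** 2 + s2 ** 2) / n)
    if se == 0:
        return 1.0
    z = abs(m1 - m2) / se
    return 2 * (1 - NormalDist().cdf(z))


def simulate_p_values(trials, n, decimals, seed):
    """Simulate baseline comparisons for `trials` randomized two-arm trials.

    Both arms are drawn from the same normal distribution, so the null
    hypothesis is true by construction. If `decimals` is not None, the
    means and SDs are rounded before testing, mimicking imprecise reporting.
    """
    rng = random.Random(seed)
    p_values = []
    for _ in range(trials):
        a = [rng.gauss(5.0, 1.0) for _ in range(n)]
        b = [rng.gauss(5.0, 1.0) for _ in range(n)]
        summary = [fmean(a), stdev(a), fmean(b), stdev(b)]
        if decimals is not None:
            summary = [round(x, decimals) for x in summary]
        p_values.append(p_value_from_summary(*summary, n))
    return p_values


def ks_uniform(p_values):
    """Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    xs = sorted(p_values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


TRIALS, N = 1000, 100
# Same seed for both runs, so the only difference is the rounding step.
d_precise = ks_uniform(simulate_p_values(TRIALS, N, None, seed=2019))
d_rounded = ks_uniform(simulate_p_values(TRIALS, N, 1, seed=2019))
# Asymptotic 5% critical value of the one-sample KS statistic.
critical = 1.36 / math.sqrt(TRIALS)

print(f"KS distance, full precision:   {d_precise:.3f}")
print(f"KS distance, rounded to 1 dp:  {d_rounded:.3f}")
print(f"approx. 5% critical value:     {critical:.3f}")
```

Under these assumed settings, rounding makes many pairs of group means coincide exactly, which yields p-values of 1 and pulls the distribution away from uniformity. This matches the direction of the effect reported in the abstract, though the magnitude depends entirely on the illustrative parameters chosen here.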