Popis: |
The allele frequency distribution (AFD) is the summarized distribution of allele frequencies across genetic loci in the studied population. The AFD carries important information about population demographic history and plays a crucial role in the efficient conduct of genetic association studies. Unlike the allele frequency spectrum (AFS), which is a sample-level concept and has received much attention, the AFD has been examined in few studies because empirical data and computational tools have been limited. In this dissertation, we investigated the AFD and related problems relevant to genome-wide association (GWA) studies. First, we established an empirical method for estimating the AFD from observable AFS data. The method proved to be effective and efficient. Using data from the ‘Program for Genomic Association’ (PGA) project and the HapMap ENCODE project, we estimated the AFD for European and African populations for use in further analysis. We next proposed an AFD-like complex disease model, which differs from the long-debated ‘Common disease common variant’ (CDCV) and ‘Common disease rare variant’ (CDRV) models. This model is theoretically reasonable and is compatible with observed results from human GWA studies. Finally, we compared the statistical power of common frequentist test methods and Bayesian methods in GWA studies, using simulations based on our AFD and complex disease model. To avoid the complicated multiple testing problem, instead of traditional power we used ‘Rank Power’, which is based on the probability that a hypothesis is a true alternative given that the first N ranked hypotheses are declared significant. The results showed that current test methods have similar power and that the improvement offered by Bayesian methods in GWA studies is marginal.
The results of this study further augment the analytical principles and methods used in genetic studies of complex disease, and they support the development of efficient designs and statistical solutions for GWA studies. |