Modelling and resampling based multiple testing with applications to genetics

Author: Huang, Yifan
Language: English
Year of publication: 2005
Subject:
Document type: Text
Description: Multiple hypothesis testing is a common problem in practice. For instance, in microarray experiments, whether the goal is to select maintenance genes for normalization or to identify differentially expressed genes between samples, many genes are under consideration at once. Multiplicity inflates the type I error rate of hypothesis testing, so the testing procedure must be adjusted to control the overall error rate. My research focuses on strong control of the Familywise Error Rate (FWER). There are two main types of approaches to multiple testing: modelling based and non-modelling based. Modelling based approaches fit models to the data so that the joint distribution of the test statistics is tractable. Non-modelling based approaches consist of inequality based methods and resampling based methods; they require little or no information about the joint distribution of the test statistics. In Chapter 1, I show that the frequently used Hochberg step-up method is a special case of partition testing based on Simes' test. This is a new result. Hochberg's step-up method is thus an inequality based, non-modelling form of partition testing. Modelling based partition testing is applicable whether the joint distribution of the test statistics is known or not. By applying modelling based partition testing when the joint distribution of the test statistics is known, I illustrate that modelling based approaches are often more powerful than inequality based non-modelling approaches. In Chapter 2, I construct counterexamples to the validity of the permutation test, demonstrating that resampling based methods are often invalid. My results suggest recommending modelling based approaches. When the joint distribution of the test statistics is intractable, modelling followed by the bootstrap can be applied. In Chapter 3, I use modelling followed by the bootstrap to select maintenance genes for normalizing gene expression data.
Database: Networked Digital Library of Theses & Dissertations
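
For readers unfamiliar with the Hochberg step-up method named in the description, the following is a minimal Python sketch of its standard textbook form. It is not taken from the thesis. The function name `hochberg_step_up` and the default level `alpha=0.05` are assumptions made for illustration. With the p-values sorted in ascending order, the procedure finds the largest rank k such that the k-th smallest p-value is at most alpha / (m - k + 1) and rejects the k hypotheses with the smallest p-values.

```python
def hochberg_step_up(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Illustrative sketch of Hochberg's step-up procedure; the name and
    signature are hypothetical, not from the thesis.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    # Start at the largest p-value (rank m) and move towards the smallest.
    for k in range(m, 0, -1):
        if p_values[order[k - 1]] <= alpha / (m - k + 1):
            # Reject the hypotheses with the k smallest p-values and stop.
            for i in order[:k]:
                reject[i] = True
            break
    return reject


if __name__ == "__main__":
    print(hochberg_step_up([0.01, 0.04, 0.03, 0.20]))
    print(hochberg_step_up([0.04, 0.045, 0.03]))
```

In the second call every p-value is at most 0.05, so the first comparison (rank 3 against alpha / 1) already succeeds and all three hypotheses are rejected, even though the smallest p-value, 0.03, exceeds the Bonferroni threshold 0.05 / 3.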