The Global Error Assessment (GEA) model for the selection of differentially expressed genes in microarray data
Authors: | Johannes Voegel, Jérôme Aubert, Anton Petrov, Julie Moulin, Nicolas Antille, Robert Mansourian, Paul Fogel, Andreas Rytz, Jean-Marc Le Goff, David M. Mutch, Matthew-Alan Roberts |
---|---|
Year of publication: | 2004 |
Subject: |
Statistics and Probability
Mean squared error; Statistics as Topic; Computational biology; Biology; computer.software_genre; Biochemistry; Statistical power; Cell Line; Interferon-gamma; Robustness (computer science); Animals; Humans; Molecular Biology; Selection (genetic algorithm); Oligonucleotide Array Sequence Analysis; Skin; Statistical hypothesis testing; Analysis of Variance; Models, Statistical; Models, Genetic; Microarray analysis techniques; Gene Expression Profiling; Sequence Analysis, DNA; Computer Science Applications; Computational Mathematics; Gene Expression Regulation; Computational Theory and Mathematics; Gene chip analysis; Data mining; DNA microarray; Sequence Alignment; computer; Algorithms; Software |
Source: | Bioinformatics. 20:2726-2737 |
ISSN: | 1367-4811 1367-4803 |
Description: | Motivation: Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often results in the use of a low number of replicates (k). When k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. More advanced statistical tests have been developed; however, they often remain difficult to implement and interpret in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining an intuitive and computationally efficient procedure. Results: The present study describes a Global Error Assessment (GEA) methodology for selecting differentially expressed genes in microarray datasets. The methodology was developed using an in vitro experiment that compared control and interferon-γ treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Gene expression results of similar absolute expression are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates variability of gene expression in each bin to absolute expression levels and uses this relationship in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates increased stability, robustness and confidence in gene selection. A subset of the selected genes was validated by real-time reverse transcription–polymerase chain reaction (RT–PCR). 
All these results suggest that the GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. Availability: The GEA code for R software is freely available upon request from the authors. |
Database: | OpenAIRE |
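
The binning idea outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names (`within_mse`, `between_ms`, `gea_like_statistics`), the equal-sized bins, and the example data are all hypothetical. As a simplification, the sketch uses the plain average within-condition error of each bin as the denominator, whereas the abstract describes a model relating variability to absolute expression; it also stops at the F-like statistic and computes no p-values.

```python
from statistics import mean


def within_mse(groups):
    """Pooled within-condition mean squared error for one gene."""
    ss = 0.0
    df = 0
    for g in groups:
        m = mean(g)
        ss += sum((x - m) ** 2 for x in g)
        df += len(g) - 1
    return ss / df


def between_ms(groups):
    """Between-condition mean square (numerator of a one-way ANOVA F)."""
    grand = mean(x for g in groups for x in g)
    ss = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    return ss / (len(groups) - 1)


def gea_like_statistics(data, n_bins):
    """Return an F-like statistic per gene using a bin-level error estimate.

    data maps a gene name to a list of conditions, each a list of replicates.
    Assumption: every gene has the same replicate layout, so averaging the
    per-gene MSEs within a bin is equivalent to pooling them.
    """
    # Order genes by overall mean expression, then cut into equal-sized bins.
    genes = sorted(data, key=lambda g: mean(x for grp in data[g] for x in grp))
    bin_size = -(-len(genes) // n_bins)  # ceiling division
    stats = {}
    for start in range(0, len(genes), bin_size):
        members = genes[start:start + bin_size]
        # Local error estimate shared by all genes in the bin.
        # (A bin whose local error is exactly zero would need special handling.)
        local_mse = mean(within_mse(data[g]) for g in members)
        for g in members:
            stats[g] = between_ms(data[g]) / local_mse
    return stats


# Hypothetical data: 4 genes, 2 conditions, k = 2 replicates per condition.
example = {
    "a": [[1.0, 3.0], [2.0, 4.0]],
    "b": [[2.0, 2.0], [6.0, 6.0]],  # zero within-condition variance
    "c": [[10.0, 12.0], [10.0, 12.0]],
    "d": [[20.0, 22.0], [30.0, 32.0]],
}
stats = gea_like_statistics(example, n_bins=2)
```

Gene `b` shows why a shared error estimate helps when k is low: its own within-condition error is zero, so a per-gene ANOVA F ratio would be undefined, while the bin-level estimate still yields a finite statistic.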