The Global Error Assessment (GEA) model for the selection of differentially expressed genes in microarray data

Authors: Johannes Voegel, Jérôme Aubert, Anton Petrov, Julie Moulin, Nicolas Antille, Robert Mansourian, Paul Fogel, Andreas Rytz, Jean-Marc Le Goff, David M. Mutch, Matthew-Alan Roberts
Year of publication: 2004
Subject:
Source: Bioinformatics. 20:2726-2737
ISSN: 1367-4811
1367-4803
Description: Motivation: Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often limits experiments to a low number of replicates (k). When k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. More advanced statistical tests have been developed; however, they often remain difficult to implement and interpret in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining an intuitive and computationally efficient procedure. Results: The present study describes a Global Error Assessment (GEA) methodology for selecting differentially expressed genes in microarray datasets. The methodology was developed using an in vitro experiment that compared control and interferon-γ treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Gene expression results with similar absolute expression levels are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates the variability of gene expression in each bin to absolute expression levels and uses this relationship in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates increased stability, robustness and confidence in gene selection. A subset of the selected genes was validated by real-time reverse transcription–polymerase chain reaction (RT–PCR). All these results suggest that the GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. Availability: The GEA code for R software is freely available upon request from the authors.
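The binning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R code, and it makes several assumptions the abstract does not specify: genes are split into equal-sized bins by their mean expression, the within-condition error is pooled across all genes in a bin, and that pooled error replaces the per-gene error in a one-way ANOVA F ratio. The step that models error as a function of expression level is omitted. The function name `binned_f_statistics` and the toy data are hypothetical.

```python
from statistics import mean


def binned_f_statistics(data, n_bins):
    """Return one F-like statistic per gene.

    data[g][c] is the list of replicate values for gene g in condition c.
    Assumption: equal-sized bins by mean expression, error pooled per bin.
    """
    grand_means, between_ms, within_ss, within_df = [], [], [], []
    for gene in data:
        values = [v for cond in gene for v in cond]
        grand = mean(values)
        cond_means = [mean(cond) for cond in gene]
        # Between-condition sum of squares (classical one-way ANOVA).
        ss_between = sum(len(cond) * (m - grand) ** 2
                         for cond, m in zip(gene, cond_means))
        # Within-condition sum of squares for this gene alone.
        ss_within = sum((v - m) ** 2
                        for cond, m in zip(gene, cond_means) for v in cond)
        grand_means.append(grand)
        between_ms.append(ss_between / (len(gene) - 1))
        within_ss.append(ss_within)
        within_df.append(len(values) - len(gene))

    # Sort genes by mean expression and cut the ordering into bins.
    order = sorted(range(len(data)), key=lambda g: grand_means[g])
    bin_size = -(-len(data) // n_bins)  # ceiling division
    f_stats = [0.0] * len(data)
    for start in range(0, len(data), bin_size):
        members = order[start:start + bin_size]
        # Pooled mean squared error within conditions for the whole bin.
        pooled_mse = (sum(within_ss[g] for g in members)
                      / sum(within_df[g] for g in members))
        for g in members:
            f_stats[g] = between_ms[g] / pooled_mse
    return f_stats


# Hypothetical toy data: 4 genes, 2 conditions, k = 2 replicates each.
toy = [
    [[1.0, 3.0], [1.0, 3.0]],
    [[2.0, 4.0], [6.0, 8.0]],
    [[10.0, 12.0], [10.0, 12.0]],
    [[20.0, 20.0], [24.0, 24.0]],
]
print(binned_f_statistics(toy, n_bins=2))  # [0.0, 8.0, 0.0, 16.0]
```

The last toy gene has zero within-condition variance, so a per-gene ANOVA F ratio would be undefined for it; pooling the error across its bin yields a finite statistic, which is the kind of stabilisation at low k that the abstract describes.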
Database: OpenAIRE