Authors: |
Santos Sde S; Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil., Galatro TF; Department of Neurology, School of Medicine, University of São Paulo, São Paulo, Brazil., Watanabe RA; Department of Neurology, School of Medicine, University of São Paulo, São Paulo, Brazil., Oba-Shinjo SM; Department of Neurology, School of Medicine, University of São Paulo, São Paulo, Brazil., Nagahashi Marie SK; Department of Neurology, School of Medicine, University of São Paulo, São Paulo, Brazil., Fujita A; Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil. |
Abstract: |
Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, because it incorporates biological knowledge about the gene sets and has greater statistical power than gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their "importance" in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and are able to identify differentially co-expressed genes that other methods failed to detect. |
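
The two information-theoretic quantities named in the abstract, spectral entropy and the Jensen-Shannon divergence between graph spectra, can be illustrated with a minimal Python sketch. This is not CoGA's R interface or its exact estimator: every function name below is hypothetical, the two adjacency matrices are made-up toy co-expression graphs, and the Gaussian kernel bandwidth, the evaluation grid, and the discretisation of the spectral density are assumptions chosen only to keep the example self-contained.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle (|theta| <= pi/4) that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                if diff != 0.0:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                else:
                    theta = math.pi / 4
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def spectral_density(eigenvalues, grid, bandwidth=0.3):
    """Gaussian-kernel estimate of the spectrum, normalised to sum to 1 on the grid."""
    dens = [sum(math.exp(-0.5 * ((x - ev) / bandwidth) ** 2) for ev in eigenvalues)
            for x in grid]
    total = sum(dens)
    return [d / total for d in dens]

def entropy(p):
    """Shannon entropy (in bits) of a discretised distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits): H(m) - (H(p) + H(q)) / 2."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

# Toy weighted adjacency matrices (assumed values, e.g. absolute correlations):
# graph_a is a tightly co-expressed module, graph_b a weakly associated one.
graph_a = [[0.0 if i == j else 0.9 for j in range(4)] for i in range(4)]
graph_b = [[0.0 if i == j else 0.1 for j in range(4)] for i in range(4)]

grid = [-4.0 + 0.05 * k for k in range(161)]  # shared grid for both spectra
eig_a = jacobi_eigenvalues(graph_a)
eig_b = jacobi_eigenvalues(graph_b)
p_a = spectral_density(eig_a, grid)
p_b = spectral_density(eig_b, grid)

print("spectral entropy A:", entropy(p_a))
print("spectral entropy B:", entropy(p_b))
print("JS divergence A vs B:", js_divergence(p_a, p_b))
```

In this sketch the divergence is zero when the two spectra coincide and at most 1 bit when they do not overlap. A test for differential co-expression would additionally need a null distribution for the statistic, for example by permuting phenotype labels; that step is not shown here.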