Genome-Wide Co-Expression Distributions as a Metric to Prioritize Genes of Functional Importance
Author: | Nicholas J. Hudson, Sigrid A. Lehnert, Marina Naval-Sanchez, Loan T. Nguyen, Laercio R. Porto-Neto, Antonio Reverter, Marina R. S. Fortes, Pâmela A. Alexandre |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: |
0301 basic medicine; lcsh:QH426-470; Computer science; Variation (game tree); Computational biology; Biology; Genome; Article; 03 medical and health sciences; 0302 clinical medicine; transcriptome analysis; Genetics; Animals; Humans; Gene Regulatory Networks; Differential (infinitesimal); Gene; Genetics (clinical); Regulator gene; Regulation of gene expression; Sequence Analysis RNA; Gene Expression Profiling; Computational Biology; correlated gene expression; Phenotype; Expression (mathematics); Hypergeometric distribution; lcsh:Genetics; Distribution (mathematics); 030104 developmental biology; Ducks; 030220 oncology & carcinogenesis; Metric (mathematics); Pairwise comparison; Cattle; Drosophila; Transcriptome; gene regulation |
Source: | Genes, Vol 11, Iss 1231, p 1231 (2020); Genes, Volume 11, Issue 10 |
ISSN: | 2073-4425 |
Description: | Genome-wide gene expression is routinely used as a tool to gain a systems-level understanding of complex biological processes. Numerical approaches that have been used to highlight influential genes include abundance, differential expression, differential variation, network connectivity and differential connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we aimed to assess a complementary strategy, namely whether the entire shape of the distribution of genome-wide co-expression values contains a meaningful biological signal that has hitherto remained hidden from view. We have developed a computational pipeline to assign one of eight distributions (including normal, skewed, bimodal, kurtotic, inverted) to every gene. We then used a hypergeometric enrichment process to determine whether particular genes (regulators versus non-regulators) and properties (differentially expressed or not) tend to be associated with particular distributions more often than would be expected by chance. Examination of several distinct data sets spanning four species indicates that there is indeed an additional biological signal present in the genome-wide distribution of co-expression values, which would be overlooked by currently adopted approaches. Author summary: High-throughput technologies, such as RNA-Seq, enable access to a vast amount of data. Here, we describe a new approach to interrogate these data and extract further information to help researchers understand complex phenotypes. Our method is based on gene-level co-expression distributions, which were compared to eight possible template shapes to group genes with similar behaviours. The method was tested using five different datasets, and the consistency of the results indicates it can be used as a complementary strategy to analyse transcriptomic data. |
Database: | OpenAIRE |
External link: |
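
The description mentions a hypergeometric enrichment step for asking whether a gene class (for example, regulators) is over-represented among genes assigned a given distribution shape. The sketch below illustrates one common way such a test can be computed, as an upper-tail hypergeometric probability. It is not taken from the paper: the function name `hypergeom_enrichment_pvalue`, its parameters, and the gene counts in the usage lines are all hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric variable.

    population: total number of genes (hypothetical background set)
    successes:  genes in the class of interest, e.g. regulators
    draws:      genes assigned to one distribution shape
    observed:   genes that are both in the class and in that shape
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("successes and draws must lie between 0 and population")
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(observed, 0), upper + 1)
    )
    return tail / total


# Illustrative numbers only (assumed, not from the paper): 10,000 genes,
# 500 regulators, 800 genes with a given shape, 80 of which are regulators.
p_value = hypergeom_enrichment_pvalue(10_000, 500, 800, 80)
print(f"enrichment p-value: {p_value:.3g}")
```

With these assumed counts, about 40 regulators would be expected in the shape group by chance, so an overlap of 80 yields a small p-value. In practice the test would be repeated for each shape and gene class, and some multiple-testing correction would normally be applied; whether and how the authors did so is not stated in this record.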