Genome-Wide Co-Expression Distributions as a Metric to Prioritize Genes of Functional Importance

Authors: Nicholas J. Hudson, Sigrid A. Lehnert, Marina Naval-Sanchez, Loan T. Nguyen, Laercio R. Porto-Neto, Antonio Reverter, Marina R. S. Fortes, Pâmela A. Alexandre
Language: English
Publication year: 2020
Subject:
Source: Genes, Vol 11, Iss 10, p 1231 (2020)
ISSN: 2073-4425
Description: Genome-wide gene expression is routinely used as a tool to gain a systems-level understanding of complex biological processes. Numerical approaches that have been used to highlight influential genes include abundance, differential expression, differential variation, network connectivity and differential connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we aimed to assess a complementary strategy, namely whether the entire shape of the distribution of genome-wide co-expression values contains a meaningful biological signal that has hitherto remained hidden from view. We have developed a computational pipeline to assign one of 8 distributions (including normal, skewed, bimodal, kurtotic, inverted) to every gene. We then used a hypergeometric enrichment process to determine whether particular genes (regulators versus non-regulators) and properties (differentially expressed or not) are associated with particular distributions more often than would be expected by chance. Examination of several distinct data sets spanning 4 species indicates that there is indeed an additional biological signal present in the genome-wide distribution of co-expression values which would be overlooked by currently adopted approaches. Author summary: High-throughput technologies, such as RNA-Seq, enable access to a vast amount of data. Here, we describe a new approach to interrogate these data and extract further information to help researchers understand complex phenotypes. Our method is based on gene-level co-expression distributions, which were compared to eight possible template shapes to group genes with similar behaviours. The method was tested using five different datasets, and the consistency of the results indicates that it can be used as a complementary strategy to analyse transcriptomic data.
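Illustration: the hypergeometric enrichment step mentioned in the description can be sketched as an upper-tail test. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, its parameters, and the gene counts in the usage example are hypothetical, and the record does not say which tail or which multiple-testing correction the authors used.

```python
from math import comb


def hypergeom_enrichment_pvalue(total_genes, category_genes, shape_genes, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    total_genes    -- all genes assigned a distribution shape (population size)
    category_genes -- genes in the category of interest, e.g. regulators
    shape_genes    -- genes assigned to one particular distribution shape
    overlap        -- genes that are both in the category and have that shape

    All names are hypothetical; they are not taken from the published pipeline.
    """
    max_overlap = min(category_genes, shape_genes)
    # Number of ways to pick shape_genes genes out of the whole population.
    denominator = comb(total_genes, shape_genes)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(category_genes, i) * comb(total_genes - category_genes, shape_genes - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Usage with made-up counts: 10,000 genes, 500 regulators,
# 800 genes with a "bimodal" shape, 70 of which are regulators.
p_value = hypergeom_enrichment_pvalue(10_000, 500, 800, 70)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value would suggest that the category is over-represented among genes with that shape. In practice one such test would be run per category and shape combination, so some correction for multiple testing would normally be applied.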
Database: OpenAIRE