Identifying Stress Responsive Genes using Overlapping Communities in Co-expression Networks

Authors: Camilo Rocha, Jorge Finke, Camila Riccio-Rengifo
Year of publication: 2020
Subjects:
FOS: Computer and information sciences
Computer Science - Machine Learning
Salinity
Genotype
QH301-705.5
Molecular Networks (q-bio.MN)
Computer applications to medicine. Medical informatics
R858-859.7
Oryza sativa
Computational biology
LASSO
Biology
Biochemistry
Machine Learning (cs.LG)
Lasso (statistics)
Stress, Physiological
Structural Biology
Quantitative Biology - Molecular Networks
Biology (General)
Cluster analysis
Molecular Biology
Gene
Social and Information Networks (cs.SI)
Stress-responsive genes
Co-expression network
Sequence Analysis, RNA
Methodology Article
Applied Mathematics
Phenotypic traits
food and beverages
Computer Science - Social and Information Networks
Oryza
Salt Tolerance
Phenotypic trait
Overlapping communities
Phenotype
Computer Science Applications
Workflow
FOS: Biological sciences
Rice
DNA microarray
Source: BMC Bioinformatics, Vol 22, Iss 1, Pp 1-17 (2021)
BMC Bioinformatics
DOI: 10.48550/arxiv.2011.03526
Description: Background: This paper proposes a workflow to identify genes that respond to specific treatments in plants. The workflow takes as input the RNA sequencing read counts and phenotypic data of different genotypes, measured under control and treatment conditions. It outputs a reduced group of genes marked as relevant for treatment response. Technically, the proposed approach is both a generalization and an extension of WGCNA. It aims to identify specific modules of overlapping communities underlying the co-expression network of genes. Modules are detected using Hierarchical Link Clustering, and such modules can capture the overlapping nature of the systems' regulatory domains that generate co-expression. LASSO regression is employed to analyze the phenotypic responses of modules to treatment. Results: The workflow is applied to rice (Oryza sativa), a major food source known to be highly sensitive to salt stress. The workflow identifies 19 rice genes that seem relevant in the response to salt stress. They are distributed across 6 modules: 3 modules, each grouping together 3 genes, are associated with shoot K content; 2 modules of 3 genes each are associated with shoot biomass; and 1 module of 4 genes is associated with root biomass. These genes represent target genes for the improvement of salinity tolerance in rice. Conclusions: A more effective framework is introduced to reduce the search space for target genes that respond to a specific treatment. It facilitates experimental validation by restricting efforts to a smaller subset of genes of high potential relevance.
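Illustration: The idea of overlapping modules obtained by clustering links rather than genes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the absolute Pearson cutoff used to build the network, and the fixed similarity threshold are all hypothetical choices made for illustration. Full Hierarchical Link Clustering builds a complete dendrogram of links and chooses where to cut it, whereas this sketch merges links by single linkage at one fixed threshold.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(expr, cutoff):
    """Link two genes when |correlation| of their profiles >= cutoff.

    Assumption: a hard cutoff stands in for whatever network
    construction the workflow actually uses.
    """
    return [(g, h) for g, h in combinations(sorted(expr), 2)
            if abs(pearson(expr[g], expr[h])) >= cutoff]


def link_communities(edges, threshold):
    """Group edges by single linkage; return the gene set of each group.

    Two edges sharing one gene are compared through the Jaccard index
    of the inclusive neighbourhoods of their two unshared genes.
    """
    nbrs = {}
    for g, h in edges:
        nbrs.setdefault(g, {g}).add(h)
        nbrs.setdefault(h, {h}).add(g)

    parent = {e: e for e in edges}  # union-find over edges

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            continue  # edges with no common gene are never merged directly
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        sim = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
        if sim >= threshold:
            parent[find(e1)] = find(e2)

    groups = {}
    for e in edges:
        groups.setdefault(find(e), set()).update(e)
    return sorted(groups.values(), key=sorted)


# Toy network: two triangles of genes that share gene "c".
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("c", "d"), ("c", "e"), ("d", "e")]
modules = link_communities(toy_edges, threshold=0.5)
overlap = modules[0] & modules[1]  # genes belonging to both modules
```

Because links are clustered instead of genes, a gene whose links fall into different groups belongs to several modules at once; in the toy network, gene "c" sits in both triangles.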
Database: OpenAIRE