Inferring and analyzing gene regulatory networks from multi-factorial expression data: a complete and interactive suite

Authors: Océane Cassan, Sophie Lèbre, Antoine Martin
Contributors: Biochimie et Physiologie Moléculaire des Plantes (BPMP), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro), Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Institut Montpelliérain Alexander Grothendieck (IMAG), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Language: English
Publication year: 2021
Subject:
0106 biological sciences
[SDV]Life Sciences [q-bio]
Gene regulatory network
Inference
Multifactorial transcriptomic analysis
Biology
Ontology (information science)
QH426-470
computer.software_genre
01 natural sciences
MESH: Gene Expression Profiling
MESH: Software
03 medical and health sciences
Model-based clustering
Gene regulatory network inference
Genetics
Cluster Analysis
[SDV.BV]Life Sciences [q-bio]/Vegetal Biology
Analysis workflow
Gene Regulatory Networks
Cluster analysis
MESH: Gene Regulatory Networks
030304 developmental biology
0303 health sciences
Gene Expression Profiling
MESH: Transcriptome
Computational Biology
MESH: Cluster Analysis
Random forest
Graphical user interface
Workflow
Data mining
User interface
Web service
Transcriptome
computer
Software
TP248.13-248.65
MESH: Computational Biology
010606 plant biology & botany
Biotechnology
Source: BMC Genomics, Vol 22, Iss 1, Pp 1-15 (2021)
BMC Genomics
BMC Genomics, BioMed Central, 2021, 22 (1), pp.387. ⟨10.1186/s12864-021-07659-2⟩
ISSN: 1471-2164
DOI: 10.1186/s12864-021-07659-2
Description: Background: High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed that allow a broad range of users to conduct standard analyses of RNA-seq data, even with little programming experience. Although existing solutions usually provide adequate procedures for normalization, exploration or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies.
Results: We developed a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data), designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. DIANE's interactive workflow provides normalization, dimensionality reduction, differential expression and ontology enrichment. Gene clustering can be performed and explored via configurable Mixture Models, and Random Forests are used to infer gene regulatory networks. DIANE also includes a novel procedure, based on permutations of Random Forest importance metrics, to assess the statistical significance of regulator-target influence measures. Throughout the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses.
Conclusions: We demonstrate the value and benefits of DIANE using a recently published dataset describing the transcriptional response of Arabidopsis thaliana to combined temperature, drought and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures on RNA-seq data, perform model-based clustering of gene expression profiles, and go further into gene network reconstruction, providing relevant candidate genes or signalling pathways to explore.
DIANE is available as a web service (https://diane.bpmp.inrae.fr), or can be installed and locally launched as a complete R package.
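The permutation idea mentioned in the description can be illustrated with a minimal Python sketch. This is not DIANE's implementation (DIANE is an R package and scores edges with Random Forest importance); here the absolute Pearson correlation is swapped in as a simple stand-in influence score, and the names `abs_pearson` and `permutation_pvalue` are hypothetical. The sketch only shows how an empirical p-value can be obtained by shuffling a regulator's expression profile to build a null distribution of the score.

```python
import random
import statistics


def abs_pearson(x, y):
    """Absolute Pearson correlation: a stand-in for an RF importance score."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no signal
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_pvalue(regulator, target, score=abs_pearson, n_perm=1000, seed=0):
    """Empirical p-value of score(regulator, target) under shuffled regulators."""
    rng = random.Random(seed)
    observed = score(regulator, target)
    shuffled = list(regulator)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any regulator-target association
        if score(shuffled, target) >= observed:
            hits += 1
    # The +1 terms count the observed score itself, so the p-value is never 0.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy expression profiles (hypothetical values, not from the paper).
    regulator = [float(i) for i in range(12)]
    target = [2.0 * v + 1.0 for v in regulator]
    print(permutation_pvalue(regulator, target))
```

In a real network inference setting, one such test is run per candidate regulator-target pair, so the resulting p-values would typically need a multiple-testing correction before edges are retained.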
Database: OpenAIRE