SEMgraph: An R Package for Causal Network Analysis of High-Throughput Data with Structural Equation Models

Authors: Palluzzi, Fernando; Grassi, Mario
Year of publication: 2021
Subject:
Document type: Working Paper
DOI: 10.1093/bioinformatics/btac567
Description: With the advent of high-throughput sequencing (HTS) in molecular biology and medicine, the need for scalable statistical solutions for modeling complex biological systems has become critical. The increasing number of platforms and possible experimental scenarios has raised the problem of integrating large amounts of new heterogeneous data with current knowledge, in order to test novel hypotheses and improve our comprehension of physiological processes and diseases. Although network theory provides a framework to represent biological systems and study their hidden properties, different algorithms still offer low reproducibility and robustness, depend on user-defined setup, and are poorly interpretable. Here we discuss the R package SEMgraph, which combines network analysis and causal inference within the framework of structural equation modeling (SEM). It provides a fully automated toolkit that manages complex biological systems as multivariate networks and ensures robustness and reproducibility through data-driven evaluation of model architecture and perturbation, with results that are readily interpretable in terms of causal effects among system components. In addition, SEMgraph offers several functions for perturbed path finding and model reduction, as well as parallelization options for the analysis of large interaction networks.
Comment: 29 pages; 5 figures; original article; R package; CRAN stable version at: https://CRAN.R-project.org/package=SEMgraph; Development version available at https://github.com/fernandoPalluzzi/SEMgraph
Database: arXiv