Structure-primed embedding on the transcription factor manifold enables transparent model architectures for gene regulatory network and latent activity inference

Autor: Andreas Tjärnberg, Maggie Beheler-Amass, Christopher A. Jackson, Lionel A. Christiaen, David Gresham, Richard Bonneau
Jazyk: angličtina
Rok vydání: 2024
Předmět:
Zdroj: Genome Biology, Vol 25, Iss 1, Pp 1-28 (2024)
Druh dokumentu: article
ISSN: 1474-760X
30558255
DOI: 10.1186/s13059-023-03134-1
Popis: Abstract Background Modeling of gene regulatory networks (GRNs) is limited due to a lack of direct measurements of genome-wide transcription factor activity (TFA) making it difficult to separate covariance and regulatory interactions. Inference of regulatory interactions and TFA requires aggregation of complementary evidence. Estimating TFA explicitly is problematic as it disconnects GRN inference and TFA estimation and is unable to account for, for example, contextual transcription factor-transcription factor interactions, and other higher order features. Deep-learning offers a potential solution, as it can model complex interactions and higher-order latent features, although does not provide interpretable models and latent features. Results We propose a novel autoencoder-based framework, StrUcture Primed Inference of Regulation using latent Factor ACTivity (SupirFactor) for modeling, and a metric, explained relative variance (ERV), for interpretation of GRNs. We evaluate SupirFactor with ERV in a wide set of contexts. Compared to current state-of-the-art GRN inference methods, SupirFactor performs favorably. We evaluate latent feature activity as an estimate of TFA and biological function in S. cerevisiae as well as in peripheral blood mononuclear cells (PBMC). Conclusion Here we present a framework for structure-primed inference and interpretation of GRNs, SupirFactor, demonstrating interpretability using ERV in multiple biological and experimental settings. SupirFactor enables TFA estimation and pathway analysis using latent factor activity, demonstrated here on two large-scale single-cell datasets, modeling S. cerevisiae and PBMC. We find that the SupirFactor model facilitates biological analysis acquiring novel functional and regulatory insight.
Databáze: Directory of Open Access Journals