Decoding the Functional Interactome of Non-Model Organisms with PHILHARMONIC.

Authors: Sledzieski S; Center for Computational Biology, Flatiron Institute, New York, NY, USA., Versavel C; Department of Computer Science, Tufts University, Medford, MA, USA., Singh R; Departments of Biostatistics & Bioinformatics and Cell Biology, Duke University, Durham, NC, USA., Ocitti F; Department of Computer Science, Tufts University, Medford, MA, USA., Devkota K; Departments of Biostatistics & Bioinformatics and Cell Biology, Duke University, Durham, NC, USA., Kumar L; Shoolini University, Solan, Himachal Pradesh 173229, India., Shpilker P; Department of Computer Science, Tufts University, Medford, MA, USA., Roger L; School of Molecular Sciences, Arizona State University, Phoenix, AZ, USA., Yang J; Department of Mechanical Engineering, Seoul National University, Seoul, South Korea., Lewinski N; Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA, USA., Putnam H; Department of Biological Sciences, University of Rhode Island, Kingston, RI, USA., Berger B; Computer Science & Artificial Intelligence Laboratory and Department of Mathematics, MIT, Cambridge, MA, USA., Klein-Seetharaman J; School of Molecular Sciences, Arizona State University, Phoenix, AZ, USA., Cowen L; Department of Computer Science, Tufts University, Medford, MA, USA.
Language: English
Source: BioRxiv : the preprint server for biology [bioRxiv] 2024 Oct 29. Date of Electronic Publication: 2024 Oct 29.
DOI: 10.1101/2024.10.25.620267
Abstract: Protein-protein interaction (PPI) networks are a fundamental resource for modeling cellular and molecular function, and a large and sophisticated toolbox has been developed to leverage their structure and topological organization to predict the functional roles of under-studied genes, proteins, and pathways. However, the overwhelming majority of experimentally-determined interactions from which such networks are constructed come from a small number of well-studied model organisms. Indeed, most species lack even a single experimentally-determined interaction in these databases, much less a network to enable the analysis of cellular function, and methods for computational PPI prediction are too noisy to apply directly. We introduce PHILHARMONIC, a novel computational approach that couples deep learning de novo network inference with robust unsupervised spectral clustering algorithms to uncover functional relationships and high-level organization in non-model organisms. Our clustering approach allows us to de-noise the predicted network, producing highly informative functional modules. We also develop a novel algorithm called ReCIPE, which aims to reconnect disconnected clusters, increasing functional enrichment and biological interpretability. We perform remote homology-based functional annotation by leveraging hmmscan and GODomainMiner to assign initial functions to proteins at large evolutionary distances. Our clusters enable us to newly assign functions to uncharacterized proteins through "function by association." We demonstrate the ability of PHILHARMONIC to recover clusters with significant functional coherence in the reef-building coral P. damicornis, its algal symbiont C. goreaui, and the well-annotated fruit fly D. melanogaster. We perform a deeper analysis of the P. damicornis network, where we show that PHILHARMONIC clusters correlate strongly with gene co-expression and investigate several clusters that participate in temperature regulation in the coral, including the first putative functional annotation of several previously uncharacterized proteins. Easy to run end-to-end and requiring only a sequenced proteome, PHILHARMONIC is an engine for biological hypothesis generation and discovery in non-model organisms. PHILHARMONIC is available at https://github.com/samsledje/philharmonic.
Database: MEDLINE