PPIscreenML: Structure-based screening for protein-protein interactions using AlphaFold.

Authors: Mischley V (Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia PA 19111; Molecular Cell Biology and Genetics, Drexel University, Philadelphia PA 19102); Maier J (Triana Biomedicines, Lexington MA 02421); Chen J (Triana Biomedicines, Lexington MA 02421); Karanicolas J (Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia PA 19111; Moulder Center for Drug Discovery Research, Temple University School of Pharmacy, Philadelphia PA 19140)
Language: English
Source: bioRxiv: the preprint server for biology [bioRxiv] 2024 Apr 30. Date of Electronic Publication: 2024 Apr 30.
DOI: 10.1101/2024.03.16.585347
Abstract: Protein-protein interactions underlie nearly all cellular processes. With the advent of protein structure prediction methods such as AlphaFold2 (AF2), models of specific protein pairs can be built extremely accurately in most cases. However, determining the relevance of a given protein pair remains an open question. It is presently unclear how best to use structure-based tools to infer whether a pair of candidate proteins indeed interact with one another; ideally, one might even use such information to screen amongst candidate pairings to build up protein interaction networks. Whereas methods for evaluating the quality of modeled protein complexes have been co-opted for determining which pairings interact (e.g., pDockQ and iPTM), there have been no rigorously benchmarked methods for this task. Here we introduce PPIscreenML, a classification model trained to distinguish AF2 models of interacting protein pairs from AF2 models of compelling decoy pairings. We find that PPIscreenML outperforms methods such as pDockQ and iPTM for this task, and further that PPIscreenML exhibits impressive performance when identifying which ligand/receptor pairings engage one another across the structurally conserved tumor necrosis factor superfamily (TNFSF). Analysis of benchmark results using complexes not seen in PPIscreenML development strongly suggests that the model generalizes beyond its training data, making it broadly applicable for identifying new protein complexes based on structural models built with AF2.
Database: MEDLINE