Robust Principal Component Analysis-based Prediction of Protein-Protein Interaction Hot spots (RBHS)
Author: | Mercedes Alfonso-Prieto, Divya Sitani, Paolo Carloni, Alejandro Giorgetti |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Models, Molecular; Computer science; Pipeline (computing); protein-protein interactions; Feature selection; Low-rank approximation; Biochemistry; Protein–protein interaction; 03 medical and health sciences; feature selection; hot spot residues; Structural Biology; ddc:570; Protein Interaction Mapping; Humans; Protein Interaction Domains and Motifs; F1-score; Databases, Protein; Molecular Biology; 030304 developmental biology; robust PCA (principal component analysis); 0303 health sciences; Principal Component Analysis; Binding Sites; Noise (signal processing); 030302 biochemistry & molecular biology; Computational Biology; Proteins; imbalanced datasets; Identification (information); machine learning; ROC Curve; noiseless data matrices; F1 score; Biological system; Robust principal component analysis; Protein Binding |
Source: | Proteins 89(6), 639-647 (2021). doi:10.1002/prot.26047 |
DOI: | 10.1002/prot.26047 |
Description: | Proteins often exert their function by binding to other cellular partners. The hot spots are key residues for protein-protein binding. Their identification may shed light on the impact of disease-associated mutations on protein complexes and help design protein-protein interaction inhibitors for therapy. Unfortunately, current machine learning methods to predict hot spots suffer from limitations caused by gross errors in the data matrices. Here, we present a novel data pre-processing pipeline that overcomes this problem by recovering a low-rank matrix with reduced noise using Robust Principal Component Analysis. Application to existing databases shows the predictive power of the method. |
Database: | OpenAIRE |
External link: |