Purely Structural Protein Scoring Functions Using Support Vector Machine and Ensemble Learning

Authors: Silvia Crivelli, Shokoufeh Mirzaei, Tomer Sidi, Chen Keasar
Year of publication: 2019
Subject:
Source: IEEE/ACM Transactions on Computational Biology and Bioinformatics. 16:1515-1523
ISSN: 2374-0043
1545-5963
DOI: 10.1109/tcbb.2016.2602269
Description: The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods carry a high price tag, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. Selecting the best-quality decoys is both challenging and essential, as end users can handle only a few. Scoring functions are therefore central to decoy selection: they combine measurable features into a single numerical indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning-based scoring functions that predict the quality of protein structures, i.e., the similarity between the predicted structure and the experimental one, without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may be more significant than the method used.
Database: OpenAIRE