Predicting cell-penetrating peptides using machine learning algorithms and navigating in their chemical space
Author: | de Oliveira, Ewerton Cristhian Lima, Santana, Kauê, Josino, Luiz, Lima e Lima, Anderson Henrique, de Souza de Sales Júnior, Claudomiro |
---|---|
Publication year: | 2020 |
Subject: |
0301 basic medicine; Support Vector Machine; Computer science; Science; Protein Data Bank (RCSB PDB); Cell-Penetrating Peptides; Machine learning; computer.software_genre; Article; Machine Learning; 03 medical and health sciences; symbols.namesake; 0302 clinical medicine; Drug Delivery Systems; Classifier (linguistics); Lipid bilayer; Gaussian process; Multidisciplinary; Artificial neural network; business.industry; Computational Biology; Models, Theoretical; Chemical space; Computational biology and bioinformatics; Support vector machine; 030104 developmental biology; Membrane; 030220 oncology & carcinogenesis; symbols; Medicine; Artificial intelligence; Neural Networks, Computer; business; computer; Algorithm; Software; Algorithms; Forecasting |
Source: | Scientific Reports, Vol 11, Iss 1, Pp 1-15 (2021) |
ISSN: | 2045-2322 |
Description: | Cell-penetrating peptides (CPPs) are naturally able to cross the lipid bilayer membrane that protects cells. These peptides share common structural and physicochemical properties and have various pharmaceutical applications, among which drug delivery is the most important. Because they can cross membranes while carrying high-molecular-weight polar molecules with them, they are termed Trojan horses. In this study, we proposed a machine learning (ML)-based framework named BChemRF-CPPred (beyond chemical rules-based framework for CPP prediction) that uses an artificial neural network, a support vector machine, and a Gaussian process classifier to differentiate CPPs from non-CPPs, using structure- and sequence-based descriptors extracted from PDB and FASTA formats. The performance of our algorithm was evaluated by tenfold cross-validation and compared with that of previously reported prediction tools using an independent dataset. BChemRF-CPPred satisfactorily identified CPP-like structures using natural and synthetic modified peptide libraries and outperformed previously reported ML-based algorithms, reaching an independent test accuracy of 90.66% (AUC = 0.9365) for PDB input and 86.5% (AUC = 0.9216) for FASTA input. Moreover, our analyses of the CPP chemical space demonstrated that these peptides break some molecular rules related to the prediction of permeability of therapeutic molecules in cell membranes. This is the first comprehensive analysis to predict synthetic and natural CPP structures and to evaluate their chemical space using an ML-based framework. Our algorithm is freely available for academic use at http://comptools.linc.ufpa.br/BChemRF-CPPred. |
Database: | OpenAIRE |
External link: |
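
The description mentions sequence-based descriptors extracted from FASTA input and evaluation by tenfold cross-validation. As a rough illustration only, the Python sketch below parses FASTA text, computes a few simple descriptors, and produces tenfold train/test index splits. The function names (`parse_fasta`, `sequence_descriptors`, `k_fold_indices`) and the choice of descriptors (length, amino acid fractions, a crude net charge) are assumptions made for this example; they are not the descriptor set or the code of BChemRF-CPPred.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def sequence_descriptors(seq):
    """Compute illustrative descriptors (hypothetical, not the paper's set)."""
    counts = Counter(seq)
    n = len(seq)
    desc = {"length": n}
    for aa in AMINO_ACIDS:
        desc["frac_" + aa] = counts[aa] / n if n else 0.0
    # Crude net charge: +1 per K/R, -1 per D/E; ignores His, termini and pH.
    desc["net_charge"] = counts["K"] + counts["R"] - counts["D"] - counts["E"]
    return desc


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous folds, unshuffled."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


if __name__ == "__main__":
    # Illustrative input: one arginine-rich peptide in FASTA format.
    fasta = ">example_peptide\nYGRKKRRQRRR\n"
    for name, seq in parse_fasta(fasta):
        d = sequence_descriptors(seq)
        print(name, d["length"], d["net_charge"])
    for train, test in k_fold_indices(25, k=10):
        print(len(train), len(test))
```

In practice the descriptor dictionaries would be turned into feature vectors and passed to a classifier inside each fold, and the folds would normally be shuffled and stratified by class; both steps are left out here to keep the sketch dependency-free.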