An MCDM approach for Reverse vaccinology model to predict bacterial protective antigens.
Authors: | Angaitkar P; Department of Information Technology, National Institute of Technology, Raipur, G.E.Road Raipur, C.G. -492010, India. Electronic address: pgangaitkar@gmail.com., Ram Janghel R; Department of Information Technology, National Institute of Technology, Raipur, G.E.Road Raipur, C.G. -492010, India. Electronic address: rrjanghel.it@nitrr.ac.in., Prasad Sahu T; Department of Information Technology, National Institute of Technology, Raipur, G.E.Road Raipur, C.G. -492010, India. Electronic address: tpsahu.it@nitrr.ac.in. |
---|---|
Language: | English |
Source: | Vaccine [Vaccine] 2024 Jul 11; Vol. 42 (18), pp. 3874-3882. Date of Electronic Publication: 2024 May 03. |
DOI: | 10.1016/j.vaccine.2024.04.078 |
Abstract: | Reverse vaccinology (RV) is a significant step in rational vaccine design. In recent years, many machine learning (ML) methods have been used to improve RV prediction accuracy. However, ML-based RV still has issues with prediction accuracy and programme accessibility. This paper presents a supervised ML-based method to classify bacterial protective antigens (BPAgs) and identify the model(s) that consistently perform well on the training dataset. Six ML classifiers are tested with physicochemical features extracted from a comprehensive training dataset. Selecting the best-performing model across different performance metrics (accuracy, precision, recall, F1-score, and AUC-ROC) is not straightforward, because all the metrics are equally important for predicting BPAgs. To address this issue, we propose a soft and hard ranking model based on a multi-criteria decision-making (MCDM) approach for selecting the best-performing ML method for classifying BPAgs. First, our proposed model uses homologous proteins (positive and negative samples) from the Protegen and UniProt databases. Second, we apply four strategies of the Synthetic Minority Oversampling Technique and Edited Nearest Neighbour (SMOTE-ENN) to handle the data imbalance problem and train the model using ML methods. Third, we use the MCDM-based technique for order preference by similarity to the ideal solution (TOPSIS), integrated with the soft and hard ranking model. Entropy is used to obtain the criterion weights for ranking the models. Our experimental evaluations on benchmark datasets show that the proposed method, with the best-performing models (Random Forest and Extreme Gradient Boosting), outperforms existing open-source RV methods. Competing Interests: Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2024. Published by Elsevier Ltd.) |
Database: | MEDLINE |
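
The entropy-weighted TOPSIS ranking described in the abstract can be illustrated with a minimal sketch. It assumes every metric is a benefit criterion (higher is better) and uses the textbook forms of entropy weighting and TOPSIS; the record does not give the authors' exact formulation, and the soft and hard ranking layer is not reproduced here. The model names (`model_a`, `model_b`, `model_c`) and all metric values are hypothetical placeholders, not results from the paper.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for a decision matrix (rows = models, columns = metrics)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n_rows)
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [v / total for v in column]
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        divergences.append(1.0 - entropy)  # more spread => more informative metric
    spread = sum(divergences)
    if spread <= 0:  # all models identical on every metric: fall back to equal weights
        return [1.0 / n_cols] * n_cols
    return [d / spread for d in divergences]

def topsis(matrix, weights):
    """Closeness to the ideal solution for each row; all criteria treated as benefits."""
    n_cols = len(matrix[0])
    # Vector-normalise each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    ideal = [max(row[j] for row in weighted) for j in range(n_cols)]
    anti_ideal = [min(row[j] for row in weighted) for j in range(n_cols)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti_ideal)
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom > 0 else 0.5)
    return scores

# Hypothetical values, in the order: accuracy, precision, recall, F1-score, AUC-ROC.
metrics = {
    "model_a": [0.90, 0.88, 0.91, 0.89, 0.95],
    "model_b": [0.85, 0.86, 0.80, 0.83, 0.90],
    "model_c": [0.70, 0.72, 0.65, 0.68, 0.75],
}
names = list(metrics)
matrix = [metrics[name] for name in names]

weights = entropy_weights(matrix)
closeness = topsis(matrix, weights)
ranking = sorted(zip(names, closeness), key=lambda pair: pair[1], reverse=True)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the entropy weights come from the spread of each metric across the models, a metric on which all models score alike contributes little to the ranking. That is one way to avoid hand-picking which of the five metrics matters most.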