Parasitism – Predation algorithm (PPA): A novel approach for feature selection

Authors: Al-Attar A. Mohamed, S.A. Hassan, A.M. Hemeida, Salem Alkhalaf, M.M.M. Mahmoud, Ayman M. Baha Eldin
Language: English
Year of publication: 2020
Subject:
Source: Ain Shams Engineering Journal, Vol 11, Iss 2, Pp 293-308 (2020)
Document type: article
ISSN: 2090-4479
DOI: 10.1016/j.asej.2019.10.004
Description: Maximizing classification accuracy and minimizing the number of selected features are the two main conflicting objectives when feature selection is used to overcome the curse of dimensionality. Classification accuracy depends heavily on the nature of the features in a dataset, which may contain irrelevant or redundant data. The main aim of feature selection is to eliminate such features to enhance classification accuracy. This work presents a new meta-heuristic optimization approach, called the Parasitism-Predation Algorithm (PPA), which mimics the interaction between the predator (cats), the parasite (cuckoos) and the host (crows) in the crow–cuckoo–cat system model to overcome the problems of low convergence and the curse of dimensionality in large data. The proposed hybrid framework combines the relative advantages of cat swarm optimization (CSO), cuckoo search (CS) and the crow search algorithm (CSA) to attain a combinatorial set of features that boosts classification accuracy. The nesting, parasitism, and predation phases are intended to improve exploration ability and balance in the context of solving classification problems. In addition, a Lévy flight distribution is applied to increase the diversity of the conventional CSA and improve its exploration ability. Meanwhile, an effective fitness function enables the proposed PPA-based feature selector, which uses the K-Nearest Neighbors (KNN) algorithm, to attain a combinatorial set of features. The proposed PPA is compared with four standard heuristic search algorithms to gauge its efficiency. Additionally, eighteen classification datasets are used to gauge its efficacy. The results show that the proposed algorithm is both effective and competitive in terms of classification performance and dimensionality reduction compared with the other heuristic algorithms.
Database: Directory of Open Access Journals
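
The abstract describes a wrapper-style fitness function that trades off KNN classification accuracy against the number of selected features, but it does not give the formula. The following is a minimal sketch of one common way to write such a fitness, not the authors' definition. The weighted sum, the weight `alpha`, the neighbor count `k`, the function names, and the toy data are all assumptions made for illustration. Lower fitness is better.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training rows."""
    # Pair every training row's Euclidean distance to x with its label, nearest first.
    neighbors = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def fitness(mask, train_X, train_y, test_X, test_y, alpha=0.99, k=3):
    """Assumed wrapper fitness: alpha * error_rate + (1 - alpha) * fraction of features kept.

    mask is a 0/1 list with one entry per feature. The weighted-sum form and
    alpha=0.99 are illustrative assumptions, not values taken from the paper.
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty feature subset gets the worst score

    def project(rows):
        # Keep only the columns switched on in the mask.
        return [[row[i] for i in selected] for row in rows]

    train_sel, test_sel = project(train_X), project(test_X)
    errors = sum(
        knn_predict(train_sel, train_y, x, k) != y
        for x, y in zip(test_sel, test_y)
    )
    error_rate = errors / len(test_y)
    return alpha * error_rate + (1 - alpha) * len(selected) / len(mask)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
train_X = [[0.0, 5.0], [0.1, -5.0], [1.0, 5.0], [0.9, -5.0]]
train_y = [0, 0, 1, 1]
test_X = [[0.05, -5.0], [0.95, 5.0]]
test_y = [0, 1]

for mask in ([1, 0], [1, 1], [0, 1]):
    print(mask, fitness(mask, train_X, train_y, test_X, test_y, k=1))
```

In a search loop, each candidate solution (a crow, cuckoo, or cat position in the paper's terminology) would be binarized into such a mask and scored this way. With `alpha` close to 1, accuracy dominates and the feature count acts mainly as a tie-breaker.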