An effective feature subset selection approach based on Jeffries-Matusita distance for multiclass problems

Autor: Basabi Chakraborty, Ashis Kumar Mandal, Rikta Sen, Saptarsi Goswami
Rok vydání: 2022
Předmět:
Zdroj: Journal of Intelligent & Fuzzy Systems. 42:4173-4190
ISSN: 1875-8967
1064-1246
Popis: Jeffries-Matusita (JM) distance, a transformation of the Bhattacharyya distance, is a widely used measure of the spectral separability distance between the two class density functions and is generally used as a class separability measure. It can be considered to have good potential to be used for evaluation of the effectiveness of a feature in discriminating two classes. The capability of JM distance as a ranking based feature selection technique for binary classification problems has been verified in some research works as well as in our earlier work. It was found by our simulation experiments with benchmark data sets that JM distance works equally well compared to other popular feature ranking methods based on mutual information, information gain or Relief. Extension of JM distance measure for feature ranking in multiclass problems has also been reported in the literature. But all of them are basically rank based approaches which deliver the ranking of the features and do not automatically produce the final optimal feature subset. In this work, a novel heuristic approach for finding out the optimum feature subset from JM distance based ranked feature lists for multiclass problems have been developed without explicitly using any specific search technique. The proposed approach integrates the extension of JM measure for multiclass problems and the selection of the final optimal feature subset in a unified process. The performance of the proposed algorithm has been evaluated by simulation experiments with benchmark data sets in comparison with two other previously developed multiclass JM distance measures (weighted average JM distance and another multiclass extension equivalent to Bhattacharyya bound) and some other popular filter based feature ranking algorithms. It is found that the proposed algorithm performs better in terms of classification accuracy, F-measure, AUC with a reduced set of features and computational cost.
Databáze: OpenAIRE
Nepřihlášeným uživatelům se plný text nezobrazuje