An effective feature subset selection approach based on Jeffries-Matusita distance for multiclass problems
Authors: | Basabi Chakraborty, Ashis Kumar Mandal, Rikta Sen, Saptarsi Goswami |
---|---|
Year of publication: | 2022 |
Subject: |
Statistics and Probability
Computer science; General Engineering; Pattern recognition; Engineering and technology; Natural sciences; Artificial Intelligence; Feature (computer vision); Physical sciences; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; General physics; Selection (genetic algorithm) |
Source: | Journal of Intelligent & Fuzzy Systems. 42:4173-4190 |
ISSN: | 1875-8967 1064-1246 |
Description: | Jeffries-Matusita (JM) distance, a transformation of the Bhattacharyya distance, is a widely used measure of the spectral separability between two class density functions and is generally used as a class separability measure. It therefore has good potential for evaluating how effectively a feature discriminates between two classes. The capability of JM distance as a ranking-based feature selection technique for binary classification problems has been verified in several research works as well as in our earlier work. Our simulation experiments with benchmark data sets found that JM distance works as well as other popular feature ranking methods based on mutual information, information gain, or Relief. Extensions of the JM distance measure for feature ranking in multiclass problems have also been reported in the literature. However, all of them are essentially rank-based approaches that deliver a ranking of the features and do not automatically produce the final optimal feature subset. In this work, a novel heuristic approach for finding the optimum feature subset from JM distance based ranked feature lists for multiclass problems has been developed without explicitly using any specific search technique. The proposed approach integrates the extension of the JM measure to multiclass problems and the selection of the final optimal feature subset in a unified process. The performance of the proposed algorithm has been evaluated by simulation experiments with benchmark data sets in comparison with two previously developed multiclass JM distance measures (weighted average JM distance and another multiclass extension equivalent to the Bhattacharyya bound) and some other popular filter-based feature ranking algorithms. The proposed algorithm is found to perform better in terms of classification accuracy, F-measure, and AUC, with a reduced set of features and lower computational cost. |
Database: | OpenAIRE |
External link: | |
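
The description above refers to JM distance as a transformation of the Bhattacharyya distance and to its use for ranking features in binary problems. The following is a minimal sketch of that idea only, not the authors' multiclass algorithm. It assumes each feature is univariate Gaussian within each class and uses the common scaling JM = 2(1 - exp(-B)), which ranges from 0 to 2; the record itself does not state either choice. The function names `bhattacharyya_gaussian`, `jm_distance`, and `rank_features` are hypothetical.

```python
import math
from statistics import fmean, pvariance


def bhattacharyya_gaussian(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    var_sum = var1 + var2
    mean_term = 0.25 * (mean1 - mean2) ** 2 / var_sum
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def jm_distance(values_a, values_b, eps=1e-12):
    """JM distance for one feature (assumed scaling: 2 * (1 - exp(-B)))."""
    mean_a, mean_b = fmean(values_a), fmean(values_b)
    # eps guards against zero variance when a feature is constant in a class.
    var_a = pvariance(values_a) + eps
    var_b = pvariance(values_b) + eps
    b = bhattacharyya_gaussian(mean_a, var_a, mean_b, var_b)
    return 2.0 * (1.0 - math.exp(-b))


def rank_features(samples, labels):
    """Rank feature indices by JM distance for a two-class problem."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        col_a = [row[j] for row, y in zip(samples, labels) if y == classes[0]]
        col_b = [row[j] for row, y in zip(samples, labels) if y == classes[1]]
        scores.append(jm_distance(col_a, col_b))
    # Higher JM distance means better class separability for that feature.
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    samples = [[0.0, 1.0], [0.1, 5.0], [0.2, 3.0],
               [10.0, 1.1], [10.1, 4.9], [10.2, 3.1]]
    labels = [0, 0, 0, 1, 1, 1]
    order, scores = rank_features(samples, labels)
    print(order, scores)
```

A ranking like this still leaves the choice of how many top features to keep, which is the gap the described work aims to close for multiclass problems.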