MCFS: Min-cut-based feature-selection

Authors: José A. Troyano, Fermín L. Cruz, Fernando Enríquez, F. Javier Ortega, Carlos G. Vallejo
Year of publication: 2020
Source: Knowledge-Based Systems. 195:105604
ISSN: 0950-7051
DOI: 10.1016/j.knosys.2020.105604
Description: This paper presents MCFS (Min-Cut-based feature-selection), a feature-selection algorithm that represents the features of a dataset as a directed graph. The main contribution of the work is to show that a general graph-processing technique is useful for the feature-selection problem in classification datasets. The vertices of the graph are the features plus two special-purpose vertices: one denotes high correlation with the class attribute of the dataset, and the other denotes low correlation with it. The edge weights are functions of the correlations among the features and of the correlations between the features and the class. A classic max-flow min-cut algorithm is applied to this graph, and the cut it returns determines the selected features. The proposal is compared with well-known feature-selection techniques. It selects a number of features that is statistically similar to that of the other techniques, while significantly improving accuracy.
Database: OpenAIRE
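
The graph construction outlined in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not give the edge-weight functions, so the capacities used here are assumptions chosen only for illustration (HIGH to feature = relevance, feature to LOW = 1 - relevance, feature to feature = pairwise correlation). The names `build_graph`, `min_cut_source_side`, `select_features`, `HIGH` and `LOW` are hypothetical. The max-flow routine is a plain Edmonds-Karp, and the features left on the HIGH side of the cut are treated as the selected ones.

```python
from collections import deque


def build_graph(relevance, redundancy, scale=1000):
    """Build integer edge capacities (ASSUMED weighting, not from the paper).

    relevance:  {feature: |correlation with the class|} in [0, 1]
    redundancy: {(f, g): |correlation between f and g|} in [0, 1]
    """
    capacity = {"HIGH": {}, "LOW": {}}
    for f, r in relevance.items():
        capacity["HIGH"][f] = round(scale * r)                    # HIGH -> feature
        capacity.setdefault(f, {})["LOW"] = round(scale * (1 - r))  # feature -> LOW
    for (f, g), c in redundancy.items():
        w = round(scale * c)
        capacity[f][g] = w  # correlated features are linked in both directions
        capacity[g][f] = w
    return capacity


def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the vertices on the source side of a min cut."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges

    while True:
        # Breadth-first search over edges with remaining capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        if sink not in parents:
            # No augmenting path left: reachable vertices form the source side.
            return set(parents)
        # Recover the augmenting path and push the bottleneck flow along it.
        path = []
        v = sink
        while parents[v] is not None:
            path.append((parents[v], v))
            v = parents[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck


def select_features(relevance, redundancy):
    """Return the features that end up on the HIGH side of the minimum cut."""
    capacity = build_graph(relevance, redundancy)
    side = min_cut_source_side(capacity, "HIGH", "LOW")
    return sorted(f for f in relevance if f in side)


if __name__ == "__main__":
    # Toy correlations, invented for the example.
    relevance = {"a": 0.9, "b": 0.8, "c": 0.2, "d": 0.1}
    redundancy = {("a", "b"): 0.3, ("c", "d"): 0.3, ("b", "c"): 0.1}
    print(select_features(relevance, redundancy))  # ['a', 'b']
```

With these toy values the cheapest cut separates {a, b} from {c, d}, so the two features most correlated with the class are kept. Capacities are scaled to integers so that the augmenting-path loop terminates without floating-point residue.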