A ranking-based feature selection for multi-label classification with fuzzy relative discernibility
Authors: | Yinglong Wang, Wenbin Qian, Chuanzhen Xiong |
---|---|
Year of publication: | 2021 |
Subject: |
Multi-label classification; 0209 industrial biotechnology; business.industry; Computer science; Feature vector; Pattern recognition; Feature selection; 02 engineering and technology; Fuzzy logic; ComputingMethodologies_PATTERNRECOGNITION; 020901 industrial engineering & automation; Ranking; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; Classifier (UML); Software; Curse of dimensionality |
Source: | Applied Soft Computing. 102:106995 |
ISSN: | 1568-4946 |
DOI: | 10.1016/j.asoc.2020.106995 |
Description: | Feature selection is a crucial pre-processing step for learning tasks that mitigates the “curse of dimensionality” caused by irrelevant and redundant features in a high-dimensional feature space. The fuzzy rough set is an effective tool for feature selection, and the fuzzy discernibility matrix, one of the generalized models of the fuzzy rough set, has attracted significant attention recently. However, because the discrimination relation among multiple labels is complicated, the traditional fuzzy discernibility matrix does not fit multi-label data well. To address this problem, this paper first defines the fuzzy label discernibility relation and the fuzzy relative discernibility relation, both derived from the fuzzy discernibility relation. It then presents the feature discernibility significance, which measures the discernibility ability of a conditional feature and is used to select the most relevant features. On this basis, a ranking-based feature selection algorithm for multi-label classification with fuzzy relative discernibility is proposed. Finally, experiments with a multi-label classifier on ten multi-label datasets from the Mulan library and the MLL resource compare the proposed algorithm with four state-of-the-art multi-label feature selection algorithms in terms of six widely accepted multi-label evaluation metrics. The experimental results demonstrate the superiority and effectiveness of the proposed algorithm in multi-label classification. |
Database: | OpenAIRE |
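The description above outlines a ranking scheme: score each feature by how well it discerns samples that the labels also discern, then sort features by that score. The following is only a minimal Python sketch of that general idea, not the paper's algorithm. The record does not give the formal definitions, so everything here is an assumption made for illustration: the function names, the use of min-max scaled absolute differences as the fuzzy feature discernibility, the normalized Hamming distance as the label discernibility, and the pairwise minimum as the combination rule.

```python
import numpy as np


def fuzzy_feature_discernibility(x):
    """Pairwise discernibility under one feature (assumed form):
    absolute difference of min-max scaled values, in [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        # A constant feature cannot tell any two samples apart.
        return np.zeros((x.size, x.size))
    z = (x - x.min()) / span
    return np.abs(z[:, None] - z[None, :])


def label_discernibility(Y):
    """Pairwise label discernibility (assumed form): fraction of labels
    on which two samples disagree (normalized Hamming distance)."""
    Y = np.asarray(Y, dtype=float)
    return np.abs(Y[:, None, :] - Y[None, :, :]).mean(axis=2)


def rank_features(X, Y):
    """Score each feature by summing, over all sample pairs, the minimum of
    its feature discernibility and the label discernibility, then rank
    features from highest to lowest score."""
    X = np.asarray(X, dtype=float)
    d_label = label_discernibility(Y)
    scores = np.array([
        np.minimum(fuzzy_feature_discernibility(X[:, j]), d_label).sum()
        for j in range(X.shape[1])
    ])
    order = np.argsort(-scores, kind="stable")
    return order, scores


# Toy data (made up): feature 0 follows the labels, feature 1 is constant,
# feature 2 is only loosely related to the labels.
X = np.array([[0.0, 5.0, 0.30],
              [0.0, 5.0, 0.90],
              [1.0, 5.0, 0.35],
              [1.0, 5.0, 0.80]])
Y = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]])

order, scores = rank_features(X, Y)
print(order, scores)
```

On this toy input the label-aligned feature ranks first and the constant feature ranks last. A top-k cut of `order` would then give the selected feature subset.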