Feature selection with ensemble learning using enriched SOM

Authors: Najet Arous, Ameni Filali, Chiraz Jlassi
Publication year: 2017
Subject:
Source: International Journal of Intelligent Systems Technologies and Applications. 16:208
ISSN: 1740-8873
1740-8865
DOI: 10.1504/ijista.2017.085357
Description: Finding pertinent subspaces in very high-dimensional datasets is a challenging task. Feature selection should be stable, yet it should also improve clustering results. Ensemble methods have successfully increased stability and clustering accuracy, but their runtime prevents them from scaling up to real-world applications. This paper addresses the problem of selecting, for each cluster, a subset of the most relevant features from a dataset. The proposed model extends the random forests method to unlabelled data using an enriched self-organising map (SOM); it assesses out-of-bag (oob) feature importance from an ensemble of partitions. Each partition is produced using a different bootstrap sample and a random subset of the features. We assessed the accuracy and scalability of the proposed method on 19 benchmark datasets and compared its effectiveness against other unsupervised feature selection methods with ensemble learning.
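The ensemble idea in the description (bootstrap samples, random feature subsets, out-of-bag importance) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the enriched SOM or the exact importance score, so the sketch substitutes a tiny k-means for the SOM and assumes a permutation-style score (the fraction of out-of-bag points whose cluster changes when one feature is shuffled). All function names and parameters here are hypothetical.

```python
import random

def nearest(point, centroids, feats):
    """Index of the closest centroid, using only the features in `feats`."""
    best, best_d = 0, float("inf")
    for j, c in enumerate(centroids):
        d = sum((point[f] - c[f]) ** 2 for f in feats)
        if d < best_d:
            best, best_d = j, d
    return best

def kmeans(rows, init, feats, iters=20):
    """Tiny k-means, used here as a stand-in for the SOM (assumption)."""
    centroids = [{f: r[f] for f in feats} for r in init]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for r in rows:
            groups[nearest(r, centroids, feats)].append(r)
        for j, g in enumerate(groups):
            if g:  # keep the previous centroid if a cluster goes empty
                centroids[j] = {f: sum(r[f] for r in g) / len(g) for f in feats}
    return centroids

def oob_feature_importance(data, n_members=30, n_feats=2, k=2, seed=0):
    """Average out-of-bag permutation score per feature over an ensemble."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    totals, counts = [0.0] * d, [0] * d
    for _ in range(n_members):
        # Bootstrap sample: draw n row indices with replacement.
        in_bag = [rng.randrange(n) for _ in range(n)]
        in_bag_set = set(in_bag)
        oob = [data[i] for i in range(n) if i not in in_bag_set]
        if not oob or len(in_bag_set) < k:
            continue
        # Random feature subset for this ensemble member.
        feats = rng.sample(range(d), n_feats)
        init = [data[i] for i in rng.sample(sorted(in_bag_set), k)]
        centroids = kmeans([data[i] for i in in_bag], init, feats)
        base = [nearest(r, centroids, feats) for r in oob]
        for f in feats:
            # Shuffle feature f among the out-of-bag rows only.
            shuffled = [r[f] for r in oob]
            rng.shuffle(shuffled)
            changed = 0
            for r, v, b in zip(oob, shuffled, base):
                perturbed = list(r)
                perturbed[f] = v
                if nearest(perturbed, centroids, feats) != b:
                    changed += 1
            totals[f] += changed / len(oob)
            counts[f] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

Calling `oob_feature_importance(data)` on a list of equal-length numeric rows returns one score per feature; under the assumptions above, features that drive the cluster structure should tend to score higher than noise features.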
Database: OpenAIRE