Feature selection with ensemble learning using enriched SOM
Author: | Najet Arous, Ameni Filali, Chiraz Jlassi |
---|---|
Publication year: | 2017 |
Subject: |
General Computer Science
Computer science; Stability (learning theory); k-means clustering; Feature selection; Pattern recognition; Ensemble learning; Random forest; ComputingMethodologies_PATTERNRECOGNITION; Feature (computer vision); Unsupervised learning; Data mining; Artificial intelligence; Cluster analysis |
Source: | International Journal of Intelligent Systems Technologies and Applications. 16:208 |
ISSN: | 1740-8873 1740-8865 |
DOI: | 10.1504/ijista.2017.085357 |
Description: | Finding pertinent subspaces in very high-dimensional datasets is a challenging task. Feature selection should be stable, yet it must also improve clustering results. Ensemble methods have successfully increased stability and clustering accuracy, but their runtime prevents them from scaling up to real-world applications. This paper addresses the problem of selecting, for each cluster, a subset of the most relevant features from a dataset. The proposed model extends the random forests method to unlabelled data using an enriched self-organising map (SOM), and assesses out-of-bag (oob) feature importance from an ensemble of partitions. Each partition is produced using a different bootstrap sample and a random subset of the features. We assess the accuracy and scalability of the proposed method on 19 benchmark datasets and compare its effectiveness against other unsupervised feature selection methods with ensemble learning. |
Database: | OpenAIRE |
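
The description outlines an ensemble of partitions, each built from a bootstrap sample and a random feature subset, with feature importance assessed on the out-of-bag (oob) points. The record does not give the algorithm itself, so the following is only a rough Python sketch of that general idea under several assumptions: plain k-means stands in for the enriched SOM, importance is measured as the fraction of oob points whose cluster assignment changes when one feature is permuted, and scores are global rather than per cluster. The function names and parameters (`kmeans`, `nearest`, `oob_feature_importance`, `n_sub`) are hypothetical and do not come from the paper.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means. ASSUMPTION: stand-in for the enriched SOM."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def oob_feature_importance(data, k=2, n_partitions=30, n_sub=2, seed=0):
    """Average oob permutation score per feature over an ensemble of partitions.

    ASSUMPTION: the score is the share of oob points that switch cluster
    when a single feature column is shuffled; the paper may define it differently.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    totals, counts = [0.0] * d, [0] * d
    for _ in range(n_partitions):
        boot = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (with replacement)
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        feats = rng.sample(range(d), n_sub)           # random subset of the features
        train = [[data[i][f] for f in feats] for i in boot]
        centroids = kmeans(train, k, rng)
        oob_pts = [[data[i][f] for f in feats] for i in oob]
        base = [nearest(p, centroids) for p in oob_pts]
        for pos, f in enumerate(feats):
            column = [p[pos] for p in oob_pts]
            rng.shuffle(column)                       # break the feature's link to the clusters
            permuted = [p[:pos] + [v] + p[pos + 1:] for p, v in zip(oob_pts, column)]
            changed = sum(b != nearest(p, centroids) for b, p in zip(base, permuted))
            totals[f] += changed / len(oob_pts)
            counts[f] += 1
    # Features never drawn into any subset get a score of 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

Calling `oob_feature_importance(data)` on a list of equal-length numeric rows returns one score per feature; higher values suggest that the partitions depend more on that feature. The sketch uses only the standard library to stay self-contained and makes no attempt to reproduce the scalability claims in the description.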