Quick and Robust Feature Selection

Author:	Decebal Constantin Mocanu, Elena Mocanu, Raymond N.J. Veldhuis, Tim van der Lee, Mykola Pechenizkiy, Ghada Sokar, Zahra Atashgahi
Contributors:	Data Mining, EAISI Health, EAISI Foundational, Electro-Optical Communication, Electrical Energy Systems, Datamanagement & Biometrics
Language:	English
Publication year:	2020
Subject:	FOS: Computer and information sciences; Computer Science - Machine Learning; Sparse autoencoders; Computer science; cs.LG; Machine Learning (stat.ML); Feature selection; Machine learning; computer.software_genre; Machine Learning (cs.LG); Reduction (complexity); Statistics - Machine Learning; Artificial Intelligence; SDG 7 - Affordable and Clean Energy; Cluster analysis; Sparse training; Artificial neural network; business.industry; Deep learning; Autoencoder; stat.ML; Feature (computer vision); Benchmark (computing); Artificial intelligence; business; computer; Software
Source:	arXiv, 2020:2012.00560. Cornell University Library; Machine Learning, 111(1), 377-414. Springer
ISSN:	2331-8422 0885-6125
Description:	The recent growth in high-dimensional data brings major complications, including high computational cost and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a dataset, has been introduced as a solution to this problem. Most existing feature selection methods are computationally inefficient, and inefficient algorithms consume a lot of energy, which is undesirable on devices with limited computational and energy resources. This paper proposes a novel and flexible method for unsupervised feature selection. The method, named QuickSelection, introduces the strength of a neuron in a sparse neural network as a criterion for measuring feature importance. Combined with sparsely connected denoising autoencoders trained with the sparse evolutionary training procedure, this criterion derives the importance of all input features simultaneously. QuickSelection is implemented in a purely sparse manner, as opposed to the typical approach of simulating sparsity with a binary mask over connections, which results in a considerable speed increase and memory reduction. When tested on several benchmark datasets, including five low-dimensional and three high-dimensional datasets, the proposed method achieves the best trade-off among classification and clustering accuracy, running time, and maximum memory usage compared with widely used feature selection approaches. It also requires the least energy among state-of-the-art autoencoder-based feature selection methods. Comment: 29 pages
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f97c58d04ad4d7f98d6d0c7f021d6ea8 https://research.tue.nl/en/publications/8dd30b63-386c-495f-8ce1-fdb226b01a78
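
Illustrative note: the abstract names "the strength of the neuron" in a sparse network as the feature-importance criterion but does not define it in this record. The sketch below is a minimal illustration under the assumption that an input neuron's strength is the sum of the absolute weights of its existing outgoing connections. The function names (`neuron_strength`, `select_features`) and the hand-written weights are hypothetical; the weights stand in for a trained sparse denoising autoencoder, and the sparse evolutionary training step is not shown. Only existing connections are stored, in a dictionary, rather than a dense matrix with a binary mask.

```python
def neuron_strength(sparse_weights, n_features):
    """Return one strength value per input feature.

    ASSUMPTION: strength = sum of absolute weights of the connections
    leaving an input neuron. `sparse_weights` maps
    (input_index, hidden_index) -> weight and holds only the
    connections that exist, so no dense mask is needed.
    """
    strength = [0.0] * n_features
    for (input_index, _hidden_index), weight in sparse_weights.items():
        strength[input_index] += abs(weight)
    return strength


def select_features(sparse_weights, n_features, k):
    """Return the indices of the k strongest input features, sorted."""
    strength = neuron_strength(sparse_weights, n_features)
    ranked = sorted(range(n_features), key=lambda i: strength[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical input-to-hidden weights for 4 features and 3 hidden neurons.
weights = {
    (0, 0): 0.9, (0, 2): -0.7,   # feature 0: strength 1.6
    (1, 1): 0.05,                # feature 1: strength 0.05
    (2, 0): 0.4, (2, 1): -0.6,   # feature 2: strength 1.0
    (3, 2): 0.1,                 # feature 3: strength 0.1
}

selected = select_features(weights, n_features=4, k=2)
print(selected)  # [0, 2]
```

Because every input neuron's strength is read from the same set of weights, all features are ranked in a single pass, which matches the abstract's statement that the importance of all input features is derived simultaneously.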