Feature Selection via Mutual Information: New Theoretical Insights
Author: | Mario Beraha, Alberto Maria Metelli, Andrea Tirinzoni, Marcello Restelli, Matteo Papini |
---|---|
Contributors: | Beraha, Mario; Metelli, Alberto Maria; Papini, Matteo; Tirinzoni, Andrea; Restelli, Marcello |
Language: | English |
Publication year: | 2019 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Computer science; Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics - Machine Learning; Feature selection; Supervised learning; Redundancy (information theory); Mutual information; Conditional mutual information; Classification; Regression; Bounded function; Data mining; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020206 networking & telecommunications; 020201 artificial intelligence & image processing; computer.software_genre; computer |
Source: | IJCNN |
Description: | Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevance of a subset of features in predicting the target variable and its redundancy with respect to other variables. However, existing algorithms are mostly heuristic and offer no guarantees on the proposed solution. In this paper, we provide novel theoretical results showing that conditional mutual information naturally arises when bounding the ideal regression/classification errors achieved by different subsets of features. Leveraging these insights, we propose a novel stopping condition for backward and forward greedy methods which ensures that the ideal prediction error using the selected feature subset remains bounded by a user-specified threshold. We provide numerical simulations to support our theoretical claims and compare against common heuristic methods. Accepted for presentation at the International Joint Conference on Neural Networks (IJCNN) 2019. |
Database: | OpenAIRE |
External link: |