Efficient feature selection using shrinkage estimators

Authors: Konstantinos Sechidis, James Weatherall, Adam Craig Pocock, Giorgio Corani, Gavin Brown, Laura Azzimonti
Year of publication: 2019
Subject:
Source: Sechidis, K, Azzimonti, L, Pocock, A, Corani, G, Weatherall, J & Brown, G 2019, 'Efficient Feature Selection Using Shrinkage Estimators', Machine Learning. https://doi.org/10.1007/s10994-019-05795-1
ISSN: 1573-0565; 0885-6125
Description: Information theoretic feature selection methods quantify the importance of each feature by estimating mutual information terms that capture the relevancy, the redundancy, and the complementarity. These terms are commonly estimated by maximum likelihood, while an under-explored area of research is how to use shrinkage methods instead. Our work suggests a novel shrinkage method for data-efficient estimation of information theoretic terms. Its small-sample behaviour makes it particularly suitable for estimating discrete distributions with a large number of categories (bins). Using our novel estimators we derive a framework for generating feature selection criteria that capture any high-order feature interaction for redundancy and complementarity. We perform a thorough empirical study across datasets from diverse sources and using various evaluation measures. Our first finding is that our shrinkage-based methods achieve better results, while keeping the same computational cost as the simple maximum-likelihood-based methods. Furthermore, under our framework we derive efficient novel high-order criteria that outperform state-of-the-art methods in various tasks.
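To make the idea of shrinkage estimation of mutual information concrete, the following is a minimal illustrative sketch, not the paper's exact estimator: it applies a James–Stein-type shrinkage of the maximum-likelihood cell probabilities toward the uniform distribution (in the style of Hausser & Strimmer) and plugs the shrunk joint distribution into the mutual information formula. All function names and the choice of uniform shrinkage target are assumptions made for this example.

```python
import numpy as np

def js_shrinkage_probs(counts):
    """Shrink ML cell probabilities toward the uniform target.

    Illustrative James-Stein-type estimator: the shrinkage intensity
    lambda is chosen in closed form and clipped to [0, 1]. This is a
    sketch of the general technique, not the paper's proposed method.
    """
    counts = np.asarray(counts, dtype=float).ravel()
    n = counts.sum()
    k = counts.size
    target = np.full(k, 1.0 / k)          # uniform shrinkage target
    if n <= 1:
        return target                     # too few samples: fall back
    p_ml = counts / n                     # maximum-likelihood estimate
    num = 1.0 - np.sum(p_ml ** 2)
    den = (n - 1.0) * np.sum((target - p_ml) ** 2)
    lam = 1.0 if den == 0 else min(1.0, max(0.0, num / den))
    return lam * target + (1.0 - lam) * p_ml

def shrinkage_mutual_information(x, y):
    """Estimate I(X;Y) in nats from the shrunk joint distribution."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((xs.size, ys.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)  # contingency table of counts
    p = js_shrinkage_probs(joint).reshape(joint.shape)
    px = p.sum(axis=1, keepdims=True)      # marginal of X
    py = p.sum(axis=0, keepdims=True)      # marginal of Y
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))
```

A feature selection criterion would then rank features by such estimated relevancy terms (e.g. `shrinkage_mutual_information(feature, label)`), possibly combined with redundancy and complementarity terms over feature pairs; the shrinkage step keeps the estimates stable when the contingency table has many sparsely populated cells.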
Database: OpenAIRE