Online Feature Selection by Adaptive Sub-gradient Methods
Author: Hao Wang, Yang Gao, Tingting Zhai, Frédéric Koriche
Contributors: Faculty of Physics and Electronic Science (PES), Hubei University of Science and Technology; Centre de Recherche en Informatique de Lens (CRIL), Université d'Artois (UA)-Centre National de la Recherche Scientifique (CNRS); China Information Technology Security Evaluation Center
Year of publication: 2019
Subject: Computer science; Online learning; Mirror descent; Feature selection; Regret; Regularization (mathematics); Streaming data; Artificial Intelligence [cs.AI]
Source: Machine Learning and Knowledge Discovery in Databases, European Conference (ECML PKDD 2018), Ireland. ISBN: 9783030109271
DOI: 10.1007/978-3-030-10928-8_26
Description: The overall goal of online feature selection is to iteratively select, from high-dimensional streaming data, a small, "budgeted" number of features for constructing accurate predictors. In this paper, we address the online feature selection problem using novel truncation techniques for two online sub-gradient methods: Adaptive Regularized Dual Averaging (ARDA) and Adaptive Mirror Descent (AMD). The corresponding truncation-based algorithms are called B-ARDA and B-AMD, respectively. The key aspect of our truncation techniques is to take into account the magnitude of feature values in the current predictor, together with their frequency in the history of predictions. A detailed regret analysis for both algorithms is provided. Experiments on six high-dimensional datasets indicate that both B-ARDA and B-AMD outperform two advanced online feature selection algorithms, OFS and SOFS, especially when the number of selected features is small. Compared to sparse online learning algorithms that use \(\ell _1\) regularization, B-ARDA is superior to \(\ell _1\)-ARDA, and B-AMD is superior to Ada-Fobos. Code related to this paper is available at: https://github.com/LUCKY-ting/online-feature-selection.
Database: OpenAIRE
External link:
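The budgeted truncation idea in the description can be illustrated with a minimal sketch. Note this is an assumption-laden simplification, not the paper's actual B-ARDA/B-AMD rule: the paper scores features by combining their magnitude in the current predictor with their frequency in the history of predictions, whereas the helper below (a hypothetical `truncate_to_budget`) keeps only the B coordinates of the weight vector with the largest absolute values and zeroes out the rest.

```python
import numpy as np

def truncate_to_budget(w, budget):
    """Keep the `budget` largest-magnitude coordinates of w; zero the rest.

    Simplified stand-in for the paper's truncation step: the real
    B-ARDA/B-AMD score also weighs each feature's frequency in the
    history of predictions, not just its current magnitude.
    """
    if budget >= w.size:
        return w.copy()
    keep = np.argsort(np.abs(w))[-budget:]  # indices of the top-B magnitudes
    truncated = np.zeros_like(w)
    truncated[keep] = w[keep]
    return truncated

# After a sub-gradient update, the predictor is projected back onto the
# budget: only the two strongest features survive here.
w = np.array([0.9, -0.1, 0.05, -1.2, 0.3])
print(truncate_to_budget(w, 2))  # -> [ 0.9  0.   0.  -1.2  0. ]
```

Applying such a truncation after every online update is what keeps the number of active features bounded by the budget B, even as high-dimensional examples stream in.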