On the Optimality of Multi-Label Classification under Subset Zero-One Loss for Distributions Satisfying the Composition Property
Author: | Gasse, Maxime; Aussem, Alex; Elghazel, Haytham |
---|---|
Contributors: | Gasse, Maxime, Integrated Solutions for Agile Manufacturing in High-mix Semiconductor Fabs - INTEGRATE - - EC:FP7:SP1-JTI2013-01-01 - 2015-12-31 - 324271 - VALID, Francis R. Bach and David M. Blei, Data Mining and Machine Learning (DM2L), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS), European Project: 324271,EC:FP7:SP1-JTI,ENIAC-2012-1,INTEGRATE(2013), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2) |
Language: | English |
Year of publication: | 2015 |
Subject: | |
Source: | Proceedings of the 32nd International Conference on Machine Learning, Jul 2015, Lille, France, pp. 2531-2539 |
Description: | The benefit of exploiting label dependence in multi-label classification is known to be closely dependent on the type of loss to be minimized. In this paper, we show that the subsets of labels that appear as irreducible factors in the factorization of the conditional distribution of the label set given the input features play a pivotal role for multi-label classification in the context of 0/1 loss minimization, as they divide the learning task into simpler independent multi-class problems. We establish theoretical results to characterize and identify these irreducible label factors for any given probability distribution satisfying the Composition property. The analysis lays the foundation for generic multi-label classification and optimal feature subset selection procedures under this subclass of distributions. Our conclusions are supported by carefully designed experiments on synthetic and benchmark data. |
Database: | OpenAIRE |
External link: |
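
The abstract's central claim, that label factors split 0/1 loss minimization into independent multi-class problems, can be illustrated with a small sketch. It is not the authors' procedure. It only checks, on made-up numbers, that when the conditional distribution of the labels is a product of factors, the most probable joint label vector (the prediction that minimizes subset 0/1 loss) equals the concatenation of each factor's most probable value. The probability tables and the function names `joint_mode` and `factorwise_mode` are hypothetical.

```python
from itertools import product

# Hypothetical conditional distributions for one fixed input x
# (illustrative numbers only, not taken from the paper).
# Factor 1 covers labels (Y1, Y2); factor 2 covers label (Y3,).
p_factor1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.2}
p_factor2 = {(0,): 0.7, (1,): 0.3}


def joint_mode(factors):
    """Brute-force argmax over the full joint, built as a product of factors."""
    best, best_p = None, -1.0
    for combo in product(*(f.items() for f in factors)):
        # Concatenate the label values of every factor into one label vector.
        labels = tuple(v for part, _ in combo for v in part)
        prob = 1.0
        for _, p in combo:
            prob *= p
        if prob > best_p:
            best, best_p = labels, prob
    return best


def factorwise_mode(factors):
    """Take the argmax of each factor separately, then concatenate."""
    parts = [max(f, key=f.get) for f in factors]
    return tuple(v for part in parts for v in part)


factors = [p_factor1, p_factor2]
assert joint_mode(factors) == factorwise_mode(factors)
```

The brute-force search visits every joint label combination, while the factor-wise version solves one small multi-class problem per factor, which is the simplification the abstract describes.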