A hybrid CP/MOLS approach for multi-objective imbalanced classification

Authors: Christophe Lecoutre, Laetitia Jourdan, Nicolas Szczepanski, Gilles Audemard, Nadarajen Veerapen, Lucien Mousin
Contributors: Operational Research, Knowledge And Data (ORKAD), Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), Centre de Recherche en Informatique de Lens (CRIL), Université d'Artois (UA)-Centre National de la Recherche Scientifique (CNRS), Faculté de gestion, économie et sciences [UCL, Lille] (FGES), Université catholique de Lille (UCL), Université Catholique de Lille - Faculté de gestion, économie et sciences (FGES), Institut Catholique de Lille (ICL), Université catholique de Lille (UCL)
Language: English
Year of publication: 2021
Subject:
Source: GECCO '21: Genetic and Evolutionary Computation Conference, Jul 2021, Lille, France. pp. 723-731
DOI: 10.1145/3449639.3459310
Description: In the domain of partial classification, recent studies about multiobjective local search (MOLS) have led to new algorithms offering high performance, particularly when the data are imbalanced. In the presence of such data, the class distribution is highly skewed and the user is often interested in the least frequent class. Making further improvements certainly requires exploiting complementary solving techniques (notably, for the rule mining problem). As Constraint Programming (CP) has been shown to be effective on various combinatorial problems, it is one such promising complementary approach. In this paper, we propose a new hybrid combination based on MOLS and CP, two techniques that are quite orthogonal. Indeed, CP is a complete approach based on powerful filtering techniques, whereas MOLS is an incomplete approach based on Pareto dominance. Experimental results on real imbalanced datasets show that our hybrid approach is statistically more efficient than a simple MOLS algorithm on both training and test instances, in particular on partial classification problems containing many attributes.
Database: OpenAIRE
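
The description states that MOLS is based on Pareto dominance. As a rough illustration of that notion only, the sketch below filters a set of candidate rules down to its non-dominated subset. It is not the authors' algorithm: the function names `dominates` and `pareto_front` are hypothetical, the objective vectors are made-up numbers, and the code assumes that every objective is to be maximized, which the record does not specify.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b.

    Assumption: every objective is maximized. a dominates b when it is
    at least as good on all objectives and strictly better on one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative objective vectors for five hypothetical rules
# (two objectives each, both maximized).
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9), (0.9, 0.1)]
front = pareto_front(candidates)
```

Here `(0.4, 0.4)` is dominated by `(0.5, 0.5)` and `(0.9, 0.1)` by `(0.9, 0.2)`, so three candidates remain in `front`. A local search built on this relation typically maintains such an archive of mutually non-dominated solutions while exploring neighbors.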