A Relaxation-based Approach for Mining Diverse Closed Patterns

Autor: Noureddine Aribi, Abdelkader Ouali, Yahia Lebbah, Samir Loudni, Arnold Hien, Albrecht Zimmermann, Mohammed El Amine Laghzaoui
Přispěvatelé: Equipe CODAG - Laboratoire GREYC - UMR6072, Groupe de Recherche en Informatique, Image et Instrumentation de Caen (GREYC), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Ingénieurs de Caen (ENSICAEN), Normandie Université (NU)-Normandie Université (NU)-Université de Caen Normandie (UNICAEN), Normandie Université (NU)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Ingénieurs de Caen (ENSICAEN), Normandie Université (NU), Département Automatique, Productique et Informatique (IMT Atlantique - DAPI), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Théorie, Algorithmes et Systèmes en Contraintes (TASC ), Laboratoire des Sciences du Numérique de Nantes (LS2N), Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN)-Université de Nantes (UN)-École Centrale de Nantes (ECN)-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Laboratoire d'Informatique et Technologies de l'Information d'Oran (LITIO), Université d'Oran Al-Sanya, Frank Hutter, Kristian Kersting, Jefrey Lijffijt, Isabel Valera
Jazyk: angličtina
Rok vydání: 2020
Předmět:
Zdroj: Machine Learning and Knowledge Discovery in Databases-European Conference, {ECML} {PKDD} 2020
Machine Learning and Knowledge Discovery in Databases-European Conference, 2020, Sep 2020, Ghent (virtual), Belgium. pp.36--54, ⟨10.1007/978-3-030-67658-2_3⟩
HAL
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020, Sep 2020, Gand, Belgium
Machine Learning and Knowledge Discovery in Databases ISBN: 9783030676575
ECML/PKDD (1)
Popis: International audience; In recent years, pattern mining has moved from a slow-moving repeated three-step process to a much more agile iterative/user-centric mining model. A vital ingredient of this framework is the ability to quickly present a set of diverse patterns to the user. In this paper, we use constraint programming (well-suited to user-centric mining due to its rich constraint language) to efficiently mine a diverse set of closed patterns. Diversity is controlled through a threshold on the Jaccard similarity of pattern occurrences. We show that the Jaccard measure has no monotonicity property, which prevents usual pruning techniques and makes classical pattern mining unworkable. This is why we propose anti-monotonic lower and upper bound relaxations, which allow effective pruning, with an efficient branching rule, boosting the whole search process. We show experimentally that our approach significantly reduces the number of patterns and is very efficient in terms of running times, particularly on dense data sets.
Databáze: OpenAIRE