On the Refinement of Conflict History Search Through Multi-Armed Bandit

Autor: Djamal Habet, Mohamed Sami Cherif, Cyril Terrioux
Přispěvatelé: COntraintes, ALgorithmes et Applications (COALA), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), ANR-16-CE40-0028,DE-MO-GRAPH,Décomposition de Modèles Graphiques(2016)
Jazyk: angličtina
Rok vydání: 2020
Předmět:
Zdroj: IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI)
IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), Nov 2020, Baltimore, United States. pp.264-271, ⟨10.1109/ICTAI50040.2020.00050⟩
ICTAI
DOI: 10.1109/ICTAI50040.2020.00050⟩
Popis: International audience; Reinforcement learning has shown its relevance in designing search heuristics for backtracking algorithms dedicated to solving decision problems under constraints. Recently, an efficient heuristic, called Conflict History Search (CHS), based on the history of search failures was introduced for the Constraint Satisfaction Problem (CSP). The Exponential Recency Weighted Average (ERWA) is used to estimate the hardness of constraints and CHS favors the variables that often appear in recent failures. The step parameter is important in CHS since it controls the estimation of the hardness of constraints and its refinement may lead to notable improvements. The current research aims to achieve this objective. Indeed, a Multi-Armed Bandits (MAB) framework can select an appropriate value of this parameter during the restarts performed by the search algorithm. Each arm represents a CHS with a given value for the step parameter and it is rewarded by its ability to improve the search. A training phase is introduced earlier in the search to help MAB choose a relevant arm. The experimental evaluation shows that this approach leads to significant improvements regarding CHS and other state-ofthe-art heuristics.
Databáze: OpenAIRE