Identifying exceptional (dis)agreement between groups

Authors: Marc Plantevit, Philippe Lamarre, Sylvie Cazalens, Adnene Belfodil
Contributors: Base de Données (BD), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2), Data Mining and Machine Learning (DM2L)
Publication year: 2019
Subject:
Source: Data Mining and Knowledge Discovery
Data Mining and Knowledge Discovery, Springer, 2020, 34 (2), pp.394-442. ⟨10.1007/s10618-019-00665-9⟩
ISSN: 1573-756X
1384-5810
DOI: 10.1007/s10618-019-00665-9
Description: International audience; Under the term behavioral data, we consider any type of data featuring individuals performing observable actions on entities. For instance, voting data depict parliamentarians who express their votes w.r.t. legislative procedures. In this work, we address the problem of discovering exceptional (dis)agreement patterns in such data, i.e., groups of individuals that exhibit an unexpected (dis)agreement under specific contexts compared to what is observed overall. To tackle this problem, we design a generic approach, rooted in the Subgroup Discovery/Exceptional Model Mining framework, which enables the discovery of such patterns in two different ways. A branch-and-bound algorithm ensures an efficient exhaustive search of the underlying search space by leveraging closure operators and optimistic estimates on the interestingness measures. A second algorithm forgoes completeness by using a sampling paradigm, which provides an alternative when an exhaustive search becomes infeasible. To illustrate the usefulness of discovering exceptional (dis)agreement patterns, we report a comprehensive experimental study on four real-world datasets relevant to three different application domains: political analysis, rating data analysis and healthcare surveillance.
Database: OpenAIRE