Interpreting Clusters via Prototype Optimization
Authors: | Kseniia Kurishchenko, Emilio Carrizosa, Alfredo Marín, Dolores Romero Morales |
---|---|
Language: | English |
Year of publication: | 2022 |
Subject: | Information Systems and Management; Biobjective optimization; Computer science; Strategy and Management; Management Science and Operations Research; Mixed-Integer Programming; Set (abstract data type); Machine Learning; Simulated data; Cluster (physics); Cluster Analysis; Prototypes; Interpretability; False positive rate; Integer programming; Algorithm; True positive rate |
Source: | Omega |
ISSN: | 1873-5274, 0305-0483 |
Description: | In this paper, we tackle the problem of enhancing the interpretability of the results of Cluster Analysis. Our goal is to find an explanation for each cluster, such that clusters are characterized as precisely and distinctively as possible, i.e., the explanation is fulfilled by as many individuals as possible in the corresponding cluster (true positive cases) and by as few individuals as possible in the remaining clusters (false positive cases). We assume that a dissimilarity between the individuals is given, and propose distance-based explanations, namely those defined by the individuals that are close to a so-called prototype. To find the set of prototypes, we address the biobjective optimization problem that maximizes the total number of true positive cases across all clusters and minimizes the total number of false positive cases, while controlling the true positive rate as well as the false positive rate in each cluster. We develop two mathematical optimization models, inspired by classic Location Analysis problems, that differ in the way individuals are allocated to prototypes. We illustrate the explanations provided by these models and their accuracy on both real-life and simulated data. |
Database: | OpenAIRE |
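To make the notions of true and false positive cases in the description concrete, the following is a minimal sketch in Python, not the authors' optimization models. It assumes that an explanation for a cluster is the set of individuals whose dissimilarity to a given prototype is at most some radius; the function names, the `radius` parameter, the toy data, and the definition of the false positive rate as a fraction of the individuals outside the cluster are all illustrative assumptions.

```python
def explanation_counts(dissim, labels, cluster, prototype, radius):
    """Count (true positives, false positives) for one distance-based explanation.

    An individual i is covered when dissim[prototype][i] <= radius
    (assumed form of the explanation; the radius is a hypothetical parameter).
    """
    tp = fp = 0
    for i, label in enumerate(labels):
        if dissim[prototype][i] <= radius:
            if label == cluster:
                tp += 1  # covered and in the explained cluster
            else:
                fp += 1  # covered but belongs to another cluster
    return tp, fp


def explanation_rates(dissim, labels, cluster, prototype, radius):
    """Return (true positive rate, false positive rate) for the explanation.

    Assumed definitions: TPR is taken over the individuals in the cluster,
    FPR over the individuals outside it. Both groups must be non-empty.
    """
    tp, fp = explanation_counts(dissim, labels, cluster, prototype, radius)
    n_in = sum(1 for label in labels if label == cluster)
    n_out = len(labels) - n_in
    return tp / n_in, fp / n_out


# Toy data (made up for illustration): six individuals on a line, two clusters.
positions = [0.0, 1.0, 2.0, 5.0, 6.0, 7.0]
labels = [0, 0, 0, 1, 1, 1]
dissim = [[abs(a - b) for b in positions] for a in positions]

# Individual 1 as prototype of cluster 0, with a tight and a loose radius.
tight = explanation_counts(dissim, labels, cluster=0, prototype=1, radius=1.0)
loose = explanation_counts(dissim, labels, cluster=0, prototype=1, radius=4.0)
```

In this toy setting the tight radius covers only the three individuals of cluster 0, while the loose radius additionally covers one individual of cluster 1, which illustrates the trade-off between the two objectives that the biobjective problem balances.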