Fair Clustering with Fair Correspondence Distribution

Authors: Hyungjin Ko, Jaewook Lee, Taeho Yoon, Woojin Lee, Junyoung Byun
Year of publication: 2021
Subject:
Source: Information Sciences. 581:155-178
ISSN: 0020-0255
Description: In recent years, the issue of fairness has become important in the field of machine learning. In clustering problems, fairness is defined in terms of consistency: the balance ratio of data with different sensitive attribute values remains constant for each cluster. Fairness problems are important in real-world applications. For example, when a recommendation system provides targeted advertisements or job offers based on the clustering result of candidates, the minority group may not get the same level of opportunity as the majority group if the clustering result is unfair. In this study, we propose a novel distribution-based fair clustering approach. Considering a distribution in which the sample is biased by society, we try to find clusters from a fair correspondence distribution. Our method uses the support vector method and a dynamical system to comprehensively divide the entire data space into atomic cells before reassembling them fairly to form the clusters. Theoretical results derive an upper bound on the generalization error of the corresponding clustering function in the fair correspondence distribution when atomic cells are connected fairly, allowing us to present an algorithm to achieve fairness. Experimental results show that our algorithm increases fairness while reducing computation time for various datasets.
Database: OpenAIRE
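
Note: The abstract describes fairness as a balance ratio between sensitive attribute values that stays constant across clusters. The sketch below illustrates one common way to measure this, assuming the balance of a cluster is the ratio of its smallest to its largest sensitive-group count. This is an assumption made for illustration and not necessarily the metric used in the paper; the function names `cluster_balance` and `overall_balance` are hypothetical.

```python
from collections import Counter


def cluster_balance(labels, groups):
    """Return {cluster: balance} for a clustering.

    Assumed definition (not taken from the paper): the balance of a
    cluster is min(group count) / max(group count), computed over all
    sensitive groups present in the dataset. A value of 1.0 means the
    groups are equally represented; 0.0 means some group is absent.
    """
    per_cluster = {}
    for cluster, group in zip(labels, groups):
        per_cluster.setdefault(cluster, Counter())[group] += 1

    all_groups = set(groups)
    balance = {}
    for cluster, counts in per_cluster.items():
        sizes = [counts.get(g, 0) for g in all_groups]
        # max(sizes) is at least 1 because every cluster holds a point.
        balance[cluster] = min(sizes) / max(sizes)
    return balance


def overall_balance(labels, groups):
    """Balance of the whole clustering: the worst cluster balance."""
    return min(cluster_balance(labels, groups).values())


# Toy data: 8 points, 2 clusters, sensitive attribute in {"a", "b"}.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
groups = ["a", "b", "a", "b", "a", "a", "a", "b"]
per_cluster = cluster_balance(labels, groups)
worst = overall_balance(labels, groups)
```

In this toy data, cluster 0 holds two points from each group while cluster 1 holds three from group "a" and one from group "b", so the second cluster drives the overall balance down. Under this assumed definition, a fair clustering would keep every cluster's balance close to that of the dataset as a whole.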