Bias in machine learning software: why? how? what to do?

Authors:	Joymallya Chakraborty, Suvodeep Majumder, Tim Menzies
Year of publication:	2021
Subject:	FOS: Computer and information sciences; Computer Science - Machine Learning; Root (linguistics); Recall; business.industry; Computer science; Resolution (logic); Affect (psychology); Machine Learning (cs.LG); Software Engineering (cs.SE); Social group; Computer Science - Software Engineering; Software; Econometrics; Marital status; business
Source:	ESEC/SIGSOFT FSE
DOI:	10.1145/3468264.3468537
Description:	Increasingly, software is making autonomous decisions in cases such as criminal sentencing, approving credit cards, and hiring employees. Some of these decisions show bias and adversely affect certain social groups (e.g. those defined by sex, race, age, marital status). Many prior works on bias mitigation take the following form: change the data or learners in multiple ways, then see if any of that improves fairness. Perhaps a better approach is to postulate root causes of bias and then apply some resolution strategy. This paper postulates that the root causes of bias are the prior decisions that affect (a) what data was selected and (b) the labels assigned to those examples. Our Fair-SMOTE algorithm removes biased labels and rebalances internal distributions such that, based on sensitive attribute, examples are equal in both positive and negative classes. On testing, it was seen that this method was just as effective at reducing bias as prior approaches. Further, models generated via Fair-SMOTE achieve higher performance (measured in terms of recall and F1) than other state-of-the-art fairness improvement algorithms. To the best of our knowledge, measured in terms of number of analyzed learners and datasets, this study is one of the largest studies on bias mitigation yet presented in the literature.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a95b3266a97003f8c20e6d420c4acc52 https://doi.org/10.1145/3468264.3468537