Sampling Safety Coefficient for Multi-class Imbalance Oversampling Algorithm

Author: DONG Minggang, LIU Ming, JING Chao
Language: Chinese
Publication year: 2020
Subject:
Source: Jisuanji kexue yu tansuo, Vol 14, Iss 10, Pp 1776-1786 (2020)
Document type: article
ISSN: 1673-9418
DOI: 10.3778/j.issn.1673-9418.1911021
Description: In multi-class imbalanced learning, traditional oversampling algorithms tend to cause overgeneralization and class overlap, which leads to poor classification performance. To improve the performance of multi-class learning, a sampling safety coefficient for multi-class imbalance oversampling (SSCMIO) algorithm is proposed. First, to prevent overgeneralization, a neighbor sampling safety coefficient is designed that assigns a small weight to neighborhoods likely to cause excessive generalization. Then, taking the global characteristics of the sample points into account, a reverse neighbor sampling safety coefficient is introduced to keep new samples from intruding into other classes, which alleviates the overlap between classes. Finally, the C4.5 decision tree is used as the base classifier. Compared with 7 representative oversampling algorithms on 16 public real-world data sets, SSCMIO obtains more than 11 optimal values on precision, recall, F-measure, MG, and MAUC; the maximum improvements on these 5 metrics are 0.4818, 0.3053, 0.3420, 0.2664, and 0.1307, respectively. The experimental results show that SSCMIO achieves better classification performance than the other 7 algorithms.
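The general idea of weighting neighborhoods by how "safe" they are can be illustrated with a small sketch. This is not the SSCMIO algorithm: the abstract does not give the formulas for either coefficient, so the score used here (the fraction of same-class points among a candidate's nearest neighbors) is a hypothetical stand-in, and the names `k_nearest`, `safety_weight`, and `oversample` are assumptions made for illustration. The sketch only shows a SMOTE-style interpolation in which partners from mixed-class neighborhoods are less likely to be chosen.

```python
import math
import random


def k_nearest(points, idx, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def safety_weight(points, labels, idx, k):
    """Hypothetical safety score (NOT the paper's coefficient):
    fraction of same-class points among the k nearest neighbors of idx."""
    neigh = k_nearest(points, idx, k)
    same = sum(1 for j in neigh if labels[j] == labels[idx])
    return same / len(neigh)


def oversample(points, labels, target, n_new, k=3, seed=0):
    """Create up to n_new synthetic points for class `target` by interpolating
    between a random base point and a same-class partner, where partners are
    drawn with probability proportional to their safety score."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == target]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate partners: the k same-class points nearest to the base.
        cands = sorted(
            (j for j in minority if j != base),
            key=lambda j: math.dist(points[base], points[j]),
        )[:k]
        weights = [safety_weight(points, labels, j, k) for j in cands]
        if sum(weights) == 0:
            continue  # no candidate (or none considered safe): skip this draw
        partner = rng.choices(cands, weights=weights, k=1)[0]
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append(
            tuple(a + gap * (b - a)
                  for a, b in zip(points[base], points[partner]))
        )
    return synthetic


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
    lbl = [0, 0, 0, 1, 1, 1, 1, 1]
    for p in oversample(pts, lbl, target=0, n_new=4, k=2):
        print(p)
```

Because unsafe draws are skipped rather than retried, the function may return fewer than `n_new` points; a real implementation would need its own policy for that case.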
Database: Directory of Open Access Journals