Constrained Oversampling: An Oversampling Approach to Reduce Noise Generation in Imbalanced Datasets With Class Overlapping

Authors: Changhui Liu, Sun Jin, Donghong Wang, Zichao Luo, Jianbo Yu, Binghai Zhou, Changlin Yang
Language: English
Year of publication: 2022
Subject:
Source: IEEE Access, Vol 10, Pp 91452-91465 (2022)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3018911
Description: Imbalanced datasets are pervasive in classification tasks and degrade classifier performance on minority-class samples. Oversampling is an effective remedy for class imbalance. However, existing oversampling methods generally introduce noisy examples into the original dataset, especially when it contains class-overlapping regions. This study proposes a new oversampling method, Constrained Oversampling, to reduce noise generation during oversampling. The algorithm first extracts the overlapping regions in the dataset. Second, Ant Colony Optimization is applied to define the boundaries of the minority regions. Third, oversampling under constraints synthesizes new samples to obtain a balanced dataset. The proposal differs from other techniques by incorporating constraints into the oversampling process to inhibit noise generation. Experiments show that it outperforms various benchmark oversampling approaches. The effectiveness of the method is explained by studying the impact of class overlapping on imbalanced learning.
Database: Directory of Open Access Journals
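
Note: The description above outlines oversampling under constraints only at a high level. The following is a minimal sketch of the general idea, not the authors' algorithm: it assumes a binary problem, uses SMOTE-style interpolation between minority neighbours, and replaces the Ant Colony Optimization boundary step with a much simpler, hypothetical k-nearest-neighbour acceptance rule. The function name `constrained_oversample` and all of its parameters are illustrative assumptions.

```python
import numpy as np


def constrained_oversample(X, y, minority=1, k=5, max_tries=10000, seed=0):
    """Illustrative sketch only: interpolate minority samples and keep a
    synthetic point only if it passes a simple neighbourhood constraint.
    The k-NN rule below is a stand-in assumption, not the ACO-based
    boundary definition described in the record."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority]

    # Number of synthetic samples needed to balance a binary dataset.
    n_needed = int(np.sum(y != minority)) - len(X_min)
    if n_needed <= 0 or len(X_min) < 2:
        return X, y

    synthetic = []
    tries = 0
    while len(synthetic) < n_needed and tries < max_tries:
        tries += 1
        # Pick a random minority seed and one of its nearest minority neighbours.
        i = rng.integers(len(X_min))
        dist_min = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dist_min)[1:k + 1]  # index 0 is the seed itself
        j = rng.choice(neighbours)

        # SMOTE-style interpolation on the segment between the two points.
        candidate = X_min[i] + rng.random() * (X_min[j] - X_min[i])

        # Constraint: accept only if most of the k nearest ORIGINAL samples
        # are minority, so points falling into majority territory are rejected.
        dist_all = np.linalg.norm(X - candidate, axis=1)
        nearest = np.argsort(dist_all)[:k]
        if np.mean(y[nearest] == minority) > 0.5:
            synthetic.append(candidate)

    if not synthetic:
        return X, y
    X_new = np.vstack([X, np.array(synthetic)])
    y_new = np.concatenate([y, np.full(len(synthetic), minority, dtype=y.dtype)])
    return X_new, y_new
```

Calling `constrained_oversample(X, y)` returns the original samples followed by the accepted synthetic ones. When the classes overlap heavily, the rejection rule may exhaust `max_tries` before the dataset is fully balanced; that trade-off between balance and noise is a property of this simplified sketch, not a reported behaviour of the published method.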