mCRF and mRD: Two Classification Methods Based on a Novel Multiclass Label Noise Filtering Learning Framework

Authors: Guoyin Wang, Zizhong Chen, Yong Zheng, Xinbo Gao, Elisabeth Giem, Shuyin Xia, Baiyun Chen
Year of publication: 2022
Subject:
Source: IEEE Transactions on Neural Networks and Learning Systems. 33:2916-2930
ISSN: 2162-2388, 2162-237X
Description: Mitigating label noise is a crucial problem in classification. Noise filtering is an effective way of dealing with label noise that requires neither an estimate of the noise rate nor reliance on any loss function. However, most filtering methods focus on binary classification, leaving the more difficult problem of multiclass classification relatively unexplored. To remedy this deficit, we present a definition of label noise in a multiclass setting and propose a general framework for a novel label noise filtering learning method for multiclass classification. Two examples of noise filtering methods for multiclass classification, multiclass complete random forest (mCRF) and multiclass relative density (mRD), are derived from their binary counterparts using our proposed framework. In addition, to optimize the NI_threshold hyperparameter in mCRF, we propose two new optimization methods: a voting cross-validation method and an adaptive method that employs a 2-means clustering algorithm. Furthermore, we incorporate SMOTE into our label noise filtering learning framework to handle the ubiquitous problem of imbalanced data in multiclass classification. We report experiments on both synthetic data sets and UCI benchmarks to demonstrate that our proposed methods are highly robust to label noise in comparison with state-of-the-art baselines. All code and data results are available at https://github.com/syxiaa/Multiclass-Label-Noise-Filtering-Learning.
Database: OpenAIRE
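
Illustrative note: the description mentions an adaptive method that uses 2-means clustering to set the NI_threshold hyperparameter. This record does not define the noise score or the exact selection rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each sample already has a hypothetical one-dimensional noise score in which higher means more suspicious, and it assumes the threshold is taken as the midpoint between the two cluster centroids. The function names `two_means_threshold` and `filter_suspected_noise` are made up for this illustration.

```python
def two_means_threshold(scores, max_iter=100):
    """Cluster 1-D scores into two groups with 2-means (Lloyd's algorithm)
    and return the midpoint between the two centroids.

    Assumption: the midpoint rule is an illustrative choice, not taken
    from the paper.
    """
    values = [float(s) for s in scores]
    if not values:
        raise ValueError("scores must be non-empty")

    # Initialise the two centroids at the extremes of the data.
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # all scores identical, nothing to separate

    for _ in range(max_iter):
        boundary = (lo + hi) / 2.0
        low = [v for v in values if v <= boundary]
        high = [v for v in values if v > boundary]
        if not low or not high:
            break  # degenerate split, keep the current centroids
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # assignments are stable, so the algorithm has converged
        lo, hi = new_lo, new_hi

    return (lo + hi) / 2.0


def filter_suspected_noise(samples, labels, scores):
    """Keep the samples whose (hypothetical) noise score is at or below
    the adaptively chosen threshold."""
    threshold = two_means_threshold(scores)
    keep = [i for i, s in enumerate(scores) if s <= threshold]
    kept_samples = [samples[i] for i in keep]
    kept_labels = [labels[i] for i in keep]
    return kept_samples, kept_labels, threshold
```

In this sketch the threshold adapts to each data set because it is derived from the score distribution itself rather than fixed in advance; a classifier would then be trained on the kept samples only.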