mCRF and mRD: Two Classification Methods Based on a Novel Multiclass Label Noise Filtering Learning Framework
Author: | Guoyin Wang, Zizhong Chen, Yong Zheng, Xinbo Gao, Elisabeth Giem, Shuyin Xia, Baiyun Chen |
---|---|
Year of publication: | 2022 |
Subject: |
Hyperparameter; Computer Networks and Communications; business.industry; Computer science; Pattern recognition; Computer Science Applications; Random forest; Multiclass classification; Noise; ComputingMethodologies_PATTERNRECOGNITION; Binary classification; Artificial Intelligence; Code (cryptography); Artificial intelligence; business; Cluster analysis; Software |
Source: | IEEE Transactions on Neural Networks and Learning Systems. 33:2916-2930 |
ISSN: | 2162-2388 2162-237X |
Description: | Mitigating label noise is a crucial problem in classification. Noise filtering is an effective method of dealing with label noise that does not need to estimate the noise rate or rely on any loss function. However, most filtering methods focus mainly on binary classification, leaving the more difficult counterpart problem of multiclass classification relatively unexplored. To remedy this deficit, we present a definition for label noise in a multiclass setting and propose a general framework for a novel label noise filtering learning method for multiclass classification. Two examples of noise filtering methods for multiclass classification, multiclass complete random forest (mCRF) and multiclass relative density (mRD), are derived from their binary counterparts using our proposed framework. In addition, to optimize the NI_threshold hyperparameter in mCRF, we propose two new optimization methods: a new voting cross-validation method and an adaptive method that employs a 2-means clustering algorithm. Furthermore, we incorporate SMOTE into our label noise filtering learning framework to handle the ubiquitous problem of imbalanced data in multiclass classification. We report experiments on both synthetic data sets and UCI benchmarks to demonstrate that our proposed methods are highly robust to label noise in comparison with state-of-the-art baselines. All code and data results are available at https://github.com/syxiaa/Multiclass-Label-Noise-Filtering-Learning. |
Database: | OpenAIRE |
External link: |