Deep Reconciled and Self-Paced TSK Fuzzy System Ensemble for Imbalanced Data Classification: Architecture, Interpretability, and Theory

Authors: Zhang, Yuanpeng; Wang, Guanjin; Zhou, Ta; Ren, Ge; Lam, Saikit; Ding, Weiping; Cai, Jing
Source: IEEE Transactions on Fuzzy Systems; November 2024, Vol. 32, Issue 11, pp. 6185-6198 (14 pages)
Abstract: Stacking-based Takagi-Sugeno-Kang (TSK) fuzzy system ensembles have been successfully applied to imbalanced data classification. However, several challenges remain. During stacking, augmenting the input feature space with output variables reduces the interpretability of the antecedents of fuzzy rules. During sampling for class balancing, the discovery of informative samples usually relies only on training samples, which may reduce generalizability. More importantly, no theory supports the reliability of stacking. To address these challenges, we propose a deep reconciled and self-paced TSK fuzzy system ensemble framework, termed D-RSP-TSKE, for imbalanced data classification. Compared with existing ensemble frameworks, it has three advantages. First, in the first layer, random undersampling generates a class-balanced training set on which an initial zero-order TSK fuzzy classifier is trained. Based on this classifier, we define a classifier-specific and testing-compatible sample sensitivity to discover informative (highly sensitive) samples, and design a reconciled and self-paced sampling approach to balance the minority class for training the subsequent layers. Second, to improve the interpretability of the antecedents of fuzzy rules, we transfer the output variables from the antecedents to the consequents through equivalent mathematical transformations that keep the final output unchanged. The transferred output variables are interpreted as dynamic fuzzy rule confidence. Third, we provide a theoretical analysis of the stacking-based ensemble to explain the mechanisms that enable the stacking strategy to deliver superior performance consistently. We evaluate D-RSP-TSKE through tests and comparisons on 7 artificial datasets and 30 real-world datasets. The experimental results demonstrate the effectiveness and interpretability of D-RSP-TSKE for imbalanced data classification.
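Illustrative note: the first-layer step described in the abstract (random undersampling followed by training a zero-order TSK fuzzy classifier) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' D-RSP-TSKE implementation: the Gaussian membership functions, the randomly chosen rule centers, the shared `width`, the ridge least-squares fit of the consequents, and all function names are hypothetical choices made for illustration. The sensitivity measure, the self-paced sampling, and the stacking layers are not reproduced, since the abstract does not specify them.

```python
import numpy as np

def random_undersample(X, y, rng):
    """Randomly drop samples until every class has the minority-class count."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

def firing_strengths(X, centers, width):
    """Normalized firing strength of each rule (product of Gaussian memberships)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * width ** 2))
    return w / (w.sum(axis=1, keepdims=True) + 1e-12)

def fit_zero_order_tsk(X, y, n_rules, width, rng, ridge=1e-3):
    """Zero-order TSK: each rule's consequent is a single constant.

    Assumption: rule centers are sampled from the data and consequents are
    fitted by ridge-regularized least squares on +/-1 targets.
    """
    centers = X[rng.choice(len(X), size=n_rules, replace=False)]
    phi = firing_strengths(X, centers, width)
    targets = np.where(y == 1, 1.0, -1.0)
    A = phi.T @ phi + ridge * np.eye(n_rules)
    consequents = np.linalg.solve(A, phi.T @ targets)
    return centers, consequents

def predict(X, centers, consequents, width):
    """Class 1 if the weighted sum of rule constants is positive, else class 0."""
    scores = firing_strengths(X, centers, width) @ consequents
    return (scores > 0).astype(int)

# Synthetic imbalanced data: 200 majority samples (class 0), 20 minority (class 1).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=3.0, scale=1.0, size=(20, 2)),
])
y = np.concatenate([np.zeros(200, dtype=int), np.ones(20, dtype=int)])

X_bal, y_bal = random_undersample(X, y, rng)
centers, consequents = fit_zero_order_tsk(X_bal, y_bal, n_rules=5, width=1.5, rng=rng)
y_pred = predict(X, centers, consequents, width=1.5)
```

In this sketch each rule reads "IF x is near center k THEN output is a constant", which is why a zero-order system keeps the consequents easy to inspect.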
Database: Supplemental Index