An Effective Random Forest Approach for Mining Graded Multi-Label Data.

Authors: Farsal, Wissal; Ramdani, Mohammed; Anter, Samir
Subject:
Source: International Journal of Intelligent Engineering & Systems; 2023, Vol. 16 Issue 4, p548-557, 10p
Abstract: Graded multi-label classification (GMLC) is an extension of multi-label classification. Whereas a multi-label classifier is limited to predicting the set of relevant labels, a graded multi-label classifier predicts the degree of relevance of each label in a given set. A key challenge of this learning problem is modelling the dependencies among the labels to improve predictive accuracy. Algorithm adaptation-based solutions, which modify the algorithms directly to handle GMLC, have proven more effective at modelling these dependencies than transformation-based models, which reduce graded multi-label datasets to a set of multi-class or binary datasets. In this paper, we propose an adaptation of the random forest algorithm (GML_DT) with an adapted CART (classification and regression trees). The adapted algorithm is based on a modified formula for the Gini index that fully models the label dependencies. The performance of the new model was tested on 101 benchmark datasets and compared against the most influential methods for GMLC. The evaluation metrics considered in the experimental study are specific to the graded multi-label setting, i.e., Hamming loss and vertical 0-1 loss. The experimental results show that the proposed model outperforms the considered models on these graded multi-label performance metrics. The gain in performance and overall accuracy is quantified by a decrease in Hamming loss of more than 13% and a decrease in vertical 0-1 loss that exceeds 10% when compared to the state-of-the-art models. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
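
Note on the metrics named in the abstract: the record does not define Hamming loss or vertical 0-1 loss, so the following is only a minimal sketch under assumed definitions. It assumes grades are integers from 0 to `max_grade`, that graded Hamming loss is the mean absolute grade difference scaled by `max_grade`, and that vertical 0-1 loss is the fraction of instance-label pairs whose grade is predicted incorrectly. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence

Grades = Sequence[Sequence[int]]  # one row per instance, one grade per label


def graded_hamming_loss(y_true: Grades, y_pred: Grades, max_grade: int) -> float:
    """Mean absolute grade difference over all instance-label pairs,
    divided by max_grade so the result lies in [0, 1] (assumed definition)."""
    total = 0
    count = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += abs(t - p)
            count += 1
    return total / (count * max_grade)


def vertical_zero_one_loss(y_true: Grades, y_pred: Grades) -> float:
    """Fraction of instance-label pairs whose predicted grade differs
    from the true grade (assumed definition)."""
    wrong = 0
    count = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            wrong += int(t != p)
            count += 1
    return wrong / count


# Hypothetical toy data: 2 instances, 3 labels, grades in 0..3.
y_true = [[0, 2, 3], [1, 1, 0]]
y_pred = [[0, 1, 3], [3, 1, 0]]

hamming = graded_hamming_loss(y_true, y_pred, max_grade=3)
vertical = vertical_zero_one_loss(y_true, y_pred)
print(hamming, vertical)
```

Under these assumptions, the Hamming variant penalizes a prediction in proportion to how far the grade is off, while the vertical 0-1 variant counts every wrong grade equally, which is why the two can rank models differently.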