An Improved Multi-label Classifier Chain Method for Automated Text Classification

Authors: Zuhaila Ali Othman, Adeleke Abdullahi, Shamsul Kamal Ahmad Khalid, Noor Azah Samsudin
Year of publication: 2021
Subject:
Source: International Journal of Advanced Computer Science and Applications. 12
ISSN: 2156-5570
2158-107X
DOI: 10.14569/ijacsa.2021.0120352
Description: Automated text classification is the task of automatically grouping text documents into categories from a predefined set. The conventional approach to classification assigns a single class label to each data point (instance). In multi-label classification (MLC), the task is to develop models that can predict multiple class labels for a single data instance. Several MLC methods exist, such as classifier chain (CC) and binary relevance (BR). However, these methods have drawbacks, such as the random ordering of the label sequence. This study attempts to address this issue, which is specific to the classifier chain method. In this paper, a hybrid heuristic evolutionary technique is proposed. The proposed PSOGCC combines particle swarm optimization (PSO) and a genetic algorithm (GA). The genetic operators of GA are integrated into the basic PSO algorithm to find the global best solution, which represents an optimized label sequence order for the classifier chain. In the experiment, three MLC methods (BR, CC, and PSOGCC) are implemented and evaluated on five benchmark multi-label datasets with five standard evaluation metrics. The proposed PSOGCC method improved the predictive performance of the classifier chain, obtaining best results of 98.66% accuracy, 99.5% precision, 99.16% recall, 99.33% F1 score, and a Hamming loss of 0.0011.
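To illustrate why the label sequence order matters in a classifier chain, the following minimal Python sketch builds the training inputs for each link of a chain under a given order. It is not taken from the paper: the function name `chain_training_sets`, the toy data, and the list-based representation are assumptions made for illustration, and no PSO or GA search is shown. Each link sees the original features plus the true values of all labels that come earlier in the order, so a different order gives every link a different input.

```python
from typing import List, Sequence, Tuple

Row = Tuple[List[float], int]


def chain_training_sets(
    X: Sequence[Sequence[float]],
    Y: Sequence[Sequence[int]],
    order: Sequence[int],
) -> List[Tuple[int, List[Row]]]:
    """Build per-link training data for a classifier chain (hypothetical helper).

    X holds one feature vector per instance, Y one binary label vector per
    instance, and `order` is a permutation of the label indices.
    Returns a list of (label index, rows), where each row pairs the
    augmented feature vector with the target value for that label.
    """
    n_labels = len(Y[0])
    if sorted(order) != list(range(n_labels)):
        raise ValueError("order must be a permutation of the label indices")

    links = []
    for position, label in enumerate(order):
        earlier = order[:position]  # labels already handled by the chain
        rows = []
        for x, y in zip(X, Y):
            # Original features followed by the earlier labels' true values.
            augmented = list(x) + [y[k] for k in earlier]
            rows.append((augmented, y[label]))
        links.append((label, rows))
    return links


# Toy data (assumed): two instances, two features, three labels.
X_demo = [[1.0, 0.0], [0.0, 1.0]]
Y_demo = [[1, 0, 1], [0, 1, 1]]

for label, rows in chain_training_sets(X_demo, Y_demo, order=[2, 0, 1]):
    print(label, rows)
```

With the order `[2, 0, 1]`, the link for label 0 receives label 2 as an extra feature; with the order `[0, 1, 2]` it would receive none. A search over orders, such as the one the abstract describes, would score each candidate permutation by training the chain and measuring its predictive performance.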
Database: OpenAIRE