Extensive Experimental Evaluation of Self-Organizing Maps for Automatic Classification of a Multi-Class Multi-Label Corpus

Authors: Eleni Giannopoulou, Nikolas Mitrou
Language: English
Year of publication: 2018
Subject:
Source: IEEE Access, Vol 6, Pp 67385-67403 (2018)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2018.2875497
Description: This paper aims at bridging the gap between feature selection and feature space size by utilizing both square and non-square self-organizing maps under different configuration scenarios for classifying a multi-class multi-label corpus, the Reuters ModApte split. The selection of non-square maps is based on a heuristic process for finding a suitable size for the self-organizing map. Vector construction is based on a simple yet effective procedure that transforms the vectors from multi-label to uni-label. The proposed solution improves classification efficiency not only in terms of accuracy but also in terms of the computational resources and training time required. Extensive experiments were conducted using different configurations of map size, vector size, and training cycles, as well as context words, to assess their impact on the classifier's performance. Furthermore, an intelligent algorithm for label selection is proposed, aiming to show that the neighboring nodes on the map affect the selection of labels for a specific node. According to the experiments conducted, our approach achieves a 10% increase in Macro-Average F1 scores, a $30\times$ decrease in vector dimensionality, and approximately $34\times$ smaller maps when compared to the baseline scenario.
Database: Directory of Open Access Journals