SSL-VFC4.5: An approach to adapt Very Fast C4.5 classification algorithm to deal with semi-supervised learning

Authors: Carlos Eduardo Nass, Agustín Alejandro Ortíz Díaz, Fabiano Baldo
Year of publication: 2021
Source: Anais do XXXVI Simpósio Brasileiro de Banco de Dados (SBBD 2021).
DOI: 10.5753/sbbd.2021.17862
Description: The growing popularity of audio and video streaming, Industry 4.0, and IoT (Internet of Things) technologies contributes to rapid growth in the generation of various types of data. Therefore, to analyze these data for decision-making, supervised machine learning techniques need to be fast while maintaining suitable predictive performance, even in the many real-life scenarios where labeled data are expensive and hard to obtain. To overcome this problem, this work proposes an adaptation of the Very Fast C4.5 (VFC4.5) algorithm that implements a semi-supervised impurity metric presented in the literature. The results indicate that this adaptation can slightly increase the accuracy of VFC4.5 when the datasets contain very few labeled instances, but it increases the training time, especially as the number of labeled instances in the datasets increases.
Database: OpenAIRE