A Discretization-based Ensemble Learning Method for Classification in High-Speed Data Streams

Author: João Roberto Bertini Junior
Year of publication: 2019
Subject:
Source: IJCNN
DOI: 10.1109/ijcnn.2019.8851703
Description: Data stream mining has attracted much attention from the machine learning community in the last decade. Motivated by the emerging issues associated with data stream applications, such as concept drift and the velocity at which data needs to be processed, several methods have been proposed in the literature, most of them resulting from adaptations of traditional algorithms. Such methods must satisfy hard constraints on memory and processing time while also keeping track of performance. In the classification context, ensembles are an effective and elegant way to handle this task. The processing-time and memory bottleneck of an ensemble lies mostly in the employed base learner and in the ensemble updating policy. This paper addresses both issues by proposing: 1) a fast base learning algorithm, which relies on discretizing every attribute range into disjoint intervals and associating, with each of them, a posterior probability relating it to a class; and 2) a static ensemble that comprises such base learners and handles concept drift without replacing base learners. Results comparing the proposed ensemble method to six ensemble approaches, on artificial and real data streams, showed that it yields comparable results with lower computational time, which makes the proposed ensemble an efficient alternative for high-speed data streams.
Database: OpenAIRE
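
The base learner outlined in the abstract (each attribute range split into disjoint intervals, each interval associated with a class posterior) can be illustrated with a minimal sketch. This is not the paper's algorithm: the class name `IntervalLearner`, the equal-width binning, the known attribute ranges, the Laplace smoothing, and the averaging of per-attribute posteriors are all assumptions chosen only to make the idea concrete.

```python
class IntervalLearner:
    """Hypothetical incremental learner: per-attribute intervals with class counts."""

    def __init__(self, ranges, n_bins, classes):
        self.ranges = ranges            # assumed known (low, high) for each attribute
        self.n_bins = n_bins            # assumed equal-width intervals per attribute
        self.classes = list(classes)
        # counts[attribute][interval][class] -> number of examples seen so far
        self.counts = [
            [dict.fromkeys(self.classes, 0) for _ in range(n_bins)]
            for _ in ranges
        ]

    def _interval(self, j, value):
        """Index of the interval of attribute j that contains value (clamped)."""
        low, high = self.ranges[j]
        idx = int((value - low) / (high - low) * self.n_bins)
        return min(max(idx, 0), self.n_bins - 1)

    def learn_one(self, x, y):
        """Update the counts with one labelled example; cost is O(#attributes)."""
        for j, value in enumerate(x):
            self.counts[j][self._interval(j, value)][y] += 1

    def posterior(self, j, value):
        """Laplace-smoothed estimate of P(class | interval) for attribute j."""
        cell = self.counts[j][self._interval(j, value)]
        total = sum(cell.values()) + len(self.classes)
        return {c: (cell[c] + 1) / total for c in self.classes}

    def predict_proba(self, x):
        """Average the per-attribute posteriors (an assumed combination rule)."""
        scores = dict.fromkeys(self.classes, 0.0)
        for j, value in enumerate(x):
            for c, p in self.posterior(j, value).items():
                scores[c] += p / len(x)
        return scores

    def predict_one(self, x):
        proba = self.predict_proba(x)
        return max(proba, key=proba.get)


# Usage on a tiny made-up stream with two attributes in [0, 1].
learner = IntervalLearner(ranges=[(0.0, 1.0), (0.0, 1.0)], n_bins=4, classes=["a", "b"])
stream = [([0.1, 0.2], "a"), ([0.15, 0.1], "a"), ([0.9, 0.8], "b"), ([0.85, 0.95], "b")]
for x, y in stream:
    learner.learn_one(x, y)
prediction = learner.predict_one([0.12, 0.18])
```

Because both learning and prediction touch only one interval per attribute, the per-example cost in this sketch is linear in the number of attributes and memory is fixed by the number of intervals, which is the kind of property the abstract associates with a fast base learner.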