A Discretization-based Ensemble Learning Method for Classification in High-Speed Data Streams
Author: | João Roberto Bertini Junior |
---|---|
Year of publication: | 2019 |
Subject: | Data stream; 0209 industrial biotechnology; Concept drift; Discretization; Data stream mining; Computer science; Posterior probability; Context (language use); 02 engineering and technology; Base (topology); Ensemble learning; Bottleneck; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining |
Source: | IJCNN |
DOI: | 10.1109/ijcnn.2019.8851703 |
Description: | Data stream mining has attracted much attention from the machine learning community in the last decade. Motivated by the emerging issues associated with data stream applications, such as concept drift and the speed at which data must be processed, several methods have been proposed in the literature, most of them adaptations of traditional algorithms. Such methods must satisfy hard constraints on memory and processing time while keeping track of performance. In the classification context, ensembles are an effective and elegant way to handle this task. The processing-time and memory bottleneck of an ensemble lies mostly in the base learner employed and in the ensemble updating policy. This paper addresses both issues by proposing: 1) a fast base learning algorithm, which discretizes every attribute range into disjoint intervals and associates with each interval a posterior probability relating it to a class; and 2) a static ensemble that comprises such base learners and handles concept drift without replacing base learners. Results comparing the proposed ensemble method with six ensemble approaches, on artificial and real data streams, show that it yields comparable results with lower computational time, which makes the proposed ensemble an efficient alternative for high-speed data streams. |
Database: | OpenAIRE |
External link: |
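
The base learner described in the abstract (disjoint intervals per attribute, each associated with a class posterior) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the class name `IntervalPosteriorLearner`, the equal-width binning, the attribute ranges assumed known in advance, the Laplace smoothing, and the sum of log posteriors across attributes are all assumptions made for illustration. The ensemble and its drift handling are not sketched, because the record gives no details about them.

```python
import math
from collections import defaultdict


class IntervalPosteriorLearner:
    """Hypothetical base learner: equal-width bins with per-bin class counts."""

    def __init__(self, ranges, n_bins=10, classes=(0, 1)):
        # ranges: one (low, high) pair per attribute, assumed known in advance.
        self.ranges = ranges
        self.n_bins = n_bins
        self.classes = tuple(classes)
        # counts[attribute][bin][class] -> number of examples seen so far
        self.counts = [
            [defaultdict(int) for _ in range(n_bins)] for _ in ranges
        ]

    def _bin(self, attr, value):
        low, high = self.ranges[attr]
        idx = int((value - low) / (high - low) * self.n_bins)
        return min(max(idx, 0), self.n_bins - 1)  # clamp out-of-range values

    def learn_one(self, x, y):
        # Incremental update: one counter increment per attribute.
        for attr, value in enumerate(x):
            self.counts[attr][self._bin(attr, value)][y] += 1

    def posterior(self, attr, value, label):
        # Laplace-smoothed estimate of P(label | interval containing value).
        bin_counts = self.counts[attr][self._bin(attr, value)]
        total = sum(bin_counts[c] for c in self.classes)
        return (bin_counts[label] + 1) / (total + len(self.classes))

    def predict_one(self, x):
        # Assumed combination rule: sum of log posteriors over attributes.
        def score(label):
            return sum(
                math.log(self.posterior(a, v, label)) for a, v in enumerate(x)
            )
        return max(self.classes, key=score)


learner = IntervalPosteriorLearner(ranges=[(0.0, 1.0), (0.0, 1.0)], n_bins=4)
stream = [((0.1, 0.2), 0), ((0.15, 0.1), 0), ((0.9, 0.8), 1), ((0.85, 0.95), 1)]
for x, y in stream:
    learner.learn_one(x, y)
print(learner.predict_one((0.12, 0.18)))
```

Under these assumptions, both learning and prediction cost one table lookup per attribute, which is the kind of constant-time update the abstract's emphasis on speed suggests.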