On Handling Missing Values in Data Stream Mining Algorithms Based on the Restricted Boltzmann Machine

Authors: Leszek Rutkowski, Piotr Duda, Danuta Rutkowska, Maciej Jaworski
Publication year: 2019
Subject:
Source: Communications in Computer and Information Science ISBN: 9783030368012
ICONIP (5)
DOI: 10.1007/978-3-030-36802-9_37
Description: This paper addresses the issue of data stream mining using the Restricted Boltzmann Machine (RBM). Recently, it was demonstrated that the RBM can be useful as a concept drift detector in data streams with time-changing probability density. In this paper, we consider another problem that often occurs in real-life data streams, namely incomplete data. We propose two modifications of the RBM learning algorithms that enable them to handle missing values. The first inserts an additional procedure before the positive phase of Contrastive Divergence. This procedure infers the missing values in the visible layer by performing a fixed number of Gibbs steps. The second modification introduces dimension-dependent minibatch sizes in the stochastic gradient descent method. The proposed methods are verified experimentally, demonstrating their usability for concept drift detection in data streams with incomplete data.
Database: OpenAIRE
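
The first modification named in the description (inferring missing visible values with a fixed number of Gibbs steps before the positive phase of Contrastive Divergence) could be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' implementation: the function names `impute_missing` and `cd1_update`, the initial guess of 0.5, the use of mean-field probabilities for the filled-in entries, the binary-unit RBM, and all hyperparameter values are assumptions. The second modification (dimension-dependent minibatch sizes) is not shown, since the abstract does not specify its details.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def impute_missing(v, mask, W, b_vis, b_hid, n_steps=5, rng=None):
    """Fill missing visible units (mask == False) with a fixed number of
    Gibbs steps, keeping the observed units clamped to their values.
    Hypothetical helper; the 0.5 start value is an assumption."""
    rng = np.random.default_rng() if rng is None else rng
    v = np.where(mask, v, 0.5)  # NaNs in missing slots are replaced here
    for _ in range(n_steps):
        p_h = sigmoid(v @ W + b_hid)                     # P(h = 1 | v)
        h = (rng.random(p_h.shape) < p_h).astype(float)  # sample hidden layer
        p_v = sigmoid(h @ W.T + b_vis)                   # P(v = 1 | h)
        v = np.where(mask, v, p_v)  # overwrite only the missing entries
    return v


def cd1_update(v_batch, mask, W, b_vis, b_hid, lr=0.05, n_impute=5, rng=None):
    """One CD-1 update on a minibatch, preceded by the imputation step.
    Updates W, b_vis and b_hid in place and returns the completed batch."""
    rng = np.random.default_rng() if rng is None else rng
    v0 = impute_missing(v_batch, mask, W, b_vis, b_hid, n_impute, rng)

    # Positive phase on the completed data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

    # Negative phase: a single reconstruction step.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)

    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return v0
```

Here a missing entry would be marked with `NaN` in the data and `False` in `mask`, for example `mask = ~np.isnan(v_batch)`. Clamping the observed units during the Gibbs steps is one common way to condition on known values; whether the paper clamps them in exactly this manner is not stated in the abstract.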