On Handling Missing Values in Data Stream Mining Algorithms Based on the Restricted Boltzmann Machine
Authors: | Leszek Rutkowski, Piotr Duda, Danuta Rutkowska, Maciej Jaworski |
---|---|
Year of publication: | 2019 |
Subject: | Restricted Boltzmann machine; Concept drift; Computer science; Data stream mining; Detector; Probability density function; Missing data; Stochastic gradient descent; Algorithm; 02 engineering and technology; 0209 industrial biotechnology; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing |
Source: | Communications in Computer and Information Science; ISBN: 9783030368012; ICONIP (5) |
DOI: | 10.1007/978-3-030-36802-9_37 |
Description: | This paper addresses data stream mining using the Restricted Boltzmann Machine (RBM). Recently, it was demonstrated that the RBM can be useful as a concept drift detector in data streams with a time-changing probability density. In this paper, we consider another problem that often occurs in real-life data streams, namely incomplete data. We propose two modifications of the RBM learning algorithm that enable it to handle missing values. The first inserts an additional procedure before the positive phase of Contrastive Divergence. This procedure infers the missing values in the visible layer by performing a fixed number of Gibbs steps. The second modification introduces dimension-dependent minibatch sizes in the stochastic gradient descent method. The proposed methods are verified experimentally, demonstrating their usability for concept drift detection in data streams with incomplete data. |
Database: | OpenAIRE |
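
The first modification described in the abstract (inferring missing visible values with a fixed number of Gibbs steps before the positive phase of Contrastive Divergence) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: a binary RBM, observed entries clamped during the Gibbs steps, missing entries initialised to 0.5, a plain CD-1 update, and the hypothetical function names `impute_missing` and `cd1_update`. The second modification (dimension-dependent minibatch sizes) is not covered here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def impute_missing(v, mask, W, b, c, n_gibbs, rng):
    """Fill missing visible units by running n_gibbs Gibbs steps.

    v    : (n_visible,) vector; entries where mask is False are ignored
    mask : (n_visible,) boolean, True where the value is observed
    W    : (n_visible, n_hidden) weights; b, c: visible and hidden biases
    Assumption: observed entries stay clamped to their original values.
    """
    v = np.where(mask, v, 0.5)  # neutral start for missing entries
    for _ in range(n_gibbs):
        h_prob = sigmoid(c + v @ W)
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_prob = sigmoid(b + h @ W.T)
        v_sample = (rng.random(v_prob.shape) < v_prob).astype(float)
        v = np.where(mask, v, v_sample)  # resample only the missing entries
    return v


def cd1_update(v0, W, b, c, lr, rng):
    """One in-place CD-1 parameter update on a completed visible vector."""
    h0_prob = sigmoid(c + v0 @ W)  # positive phase
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(b + h0 @ W.T)  # negative phase (reconstruction)
    h1_prob = sigmoid(c + v1_prob @ W)
    W += lr * (np.outer(v0, h0_prob) - np.outer(v1_prob, h1_prob))
    b += lr * (v0 - v1_prob)
    c += lr * (h0_prob - h1_prob)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.1, size=(4, 3))
    b, c = np.zeros(4), np.zeros(3)
    x = np.array([1.0, np.nan, 0.0, np.nan])  # NaN marks a missing value
    observed = ~np.isnan(x)
    x_filled = impute_missing(x, observed, W, b, c, n_gibbs=5, rng=rng)
    cd1_update(x_filled, W, b, c, lr=0.1, rng=rng)
    print(x_filled)
```

In a streaming setting, each incoming element would pass through `impute_missing` with the current parameters before the update step, so imputation quality depends on how well the RBM has already been trained.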