Toward Mining Capricious Data Streams: A Generative Approach

Authors: Ege Beyazit, Baijun Wu, Yi He, Sheng Chen, Di Wu, Xindong Wu
Publication year: 2021
Subject:
Source: IEEE Transactions on Neural Networks and Learning Systems. 32:1228-1240
ISSN: 2162-2388, 2162-237X
DOI: 10.1109/tnnls.2020.2981386
Description: Learning with streaming data has received extensive attention in the past few years. Existing approaches assume that the feature space is fixed or changes according to explicit regularities, which limits their applicability in real-time applications. For example, in a smart healthcare platform, the feature space of the patient data varies when different medical service providers use nonidentical feature sets to describe patients' symptoms. To fill this gap, we propose a novel learning paradigm, Generative Learning With Streaming Capricious (GLSC) data, which makes no assumption about the feature space dynamics. In other words, GLSC handles data streams with a varying feature space, where each arriving data instance can arbitrarily carry new features and/or stop carrying some of the old features. Specifically, GLSC trains a learner on a universal feature space that establishes relationships between old and new features, so that patterns learned in the old feature space can be used in the new feature space. The universal feature space is constructed by leveraging the relatedness among features. We propose a generative graphical model of the construction process and show that learning from the universal feature space can effectively improve performance, with theoretical guarantees. The experimental results demonstrate that GLSC performs strongly on both synthetic and real data sets.
Database: OpenAIRE