Improving data partition schemes in Smart Grids via clustering data streams
Authors: | Andreu Sancho-Asensio, Itziar Arrieta-Salinas, Elisabet Golobardes, Virginia Jiménez-Ruano, José Enrique Armendáriz-Iñigo, Agustín Zaballos, Joan Navarro |
---|---|
Year of publication: | 2014 |
Subject: | Data stream mining; Computer science; Test data generation; Supervised learning; General Engineering; Semi-supervised learning; Machine learning; Replication (computing); Computer Science Applications; Smart grid; Artificial intelligence; Unsupervised learning; Cluster analysis |
Source: | Expert Systems with Applications. 41:5832-5842 |
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2014.03.035 |
Description: | Data mining techniques are traditionally divided into two distinct disciplines depending on the task the algorithm performs: supervised learning and unsupervised learning. The former aims at making accurate predictions after assuming an underlying structure in the data, which requires the presence of a teacher during the learning phase; the latter aims at discovering regularly occurring patterns in the data without making any a priori assumptions about their underlying structure. A purely supervised approach can construct a very accurate predictive model from data streams. However, in many real-world problems this paradigm may be ill-suited because of (1) the dearth of training examples and (2) the cost of labeling the information required to train the system. A sound use case of this concern arises when defining data replication and partitioning policies to store the data generated in the Smart Grid domain, which aims to adapt electric networks to current application demands (e.g., real-time consumption, network self-adaptation). As opposed to classic electrical architectures, Smart Grids encompass a fully distributed scheme with several diverse data generation sources. Current data storage and replication systems fail both to cope with such an overwhelming amount of heterogeneous data and to satisfy the stringent requirements posed by this technology (i.e., the dynamic nature of the physical resources, the continuous flow of information, and the demand for autonomous behavior). The purpose of this paper is to apply unsupervised learning techniques to enhance the performance of data storage in Smart Grids. More specifically, we have improved the eXtended Classifier System for Clustering (XCSc) algorithm to present a hybrid system that mixes data replication and partitioning policies by means of an online clustering approach. The experiments conducted show that the proposed system outperforms previous proposals and fits the Smart Grid premises. |
Database: | OpenAIRE |