Machine Learning Platform for Extreme Scale Computing on Compressed IoT Data

Authors:	Seshu Tirupathi, Dhaval Salwala, Giulio Zizzo, Ambrish Rawat, Mark Purcell, Soren Kejser Jensen, Christian Thomsen, Nguyen Ho, Carlos E. Muniz Cuza, Jonas Brusokas, Torben Bach Pedersen, Giorgos Alexiou, Giorgos Giannopoulos, Panagiotis Gidarakos, Alexandros Kalimeris, Stavros Maroulis, George Papastefanatos, Ioannis Psarros, Vassilis Stamatopoulos, Manolis Terrovitis
Contributors:	Tsumoto, Shusaku; Ohsawa, Yukio; Chen, Lei; Van den Poel, Dirk; Hu, Xiaohua; Motomura, Yoichi; Takagi, Takuya; Wu, Lingfei; Xie, Ying; Abe, Akihiro; Raghavan, Vijay
Language:	English
Publication year:	2022
Subjects:	Big Data; Machine Learning; Query processing; Lossless Data Compression; Edge; Smart homes; Data models; Time series analysis; Renewable Energy Sources; Lossy Data Compression; Industries; Cloud
Source:	Tirupathi, S, Salwala, D, Zizzo, G, Rawat, A, Purcell, M, Jensen, S K, Thomsen, C, Ho, N, Cuza, C E M, Brusokas, J, Pedersen, T B, Alexiou, G, Giannopoulos, G, Gidarakos, P, Kalimeris, A, Maroulis, S, Papastefanatos, G, Psarros, I, Stamatopoulos, V & Terrovitis, M 2022, Machine Learning Platform for Extreme Scale Computing on Compressed IoT Data. in S Tsumoto, Y Ohsawa, L Chen, D Van den Poel, X Hu, Y Motomura, T Takagi, L Wu, Y Xie, A Abe & V Raghavan (eds), 2022 IEEE International Conference on Big Data (Big Data), 10020540, IEEE Communications Society, pp. 3179-3185, 2022 IEEE International Conference on Big Data (Big Data), 17/12/2022. https://doi.org/10.1109/BigData55660.2022.10020540
Description:	With the lowering costs of sensors, high-volume and high-velocity data are increasingly being generated and analyzed, especially in IoT domains like energy and smart homes. Consequently, applications that require accurate short-term forecasts and predictions are also steadily increasing. In this paper, we provide an overview of a novel end-to-end platform that provides efficient ingestion, compression, transfer, query processing, and machine learning-based analytics for high-frequency and high-volume time series from IoT. The performance of the platform is evaluated using a real-world dataset from RES (renewable energy sources) installations. The results show the importance of high-frequency analytics and the surprisingly positive impact of error-bounded lossy compression on machine learning in the form of AutoML. For example, when detecting yaw misalignments in wind turbines, an improvement of 9% in accuracy was observed for AutoML models on lossy compressed data compared to the current industry standard of 10-minute aggregated data. Thus, these small-scale experiments show the potential of the platform, and larger pilots are planned.
Database:	OpenAIRE
External links:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6a2118a8cdcd489e5491601b574f9032 https://vbn.aau.dk/da/publications/771ac3a1-1afc-44d3-b442-4f64c86ec4e0