Detection and Classification of Anomalies in Large Datasets on the Basis of Information Granules

Authors: Krystyna Kiersztyn, Adam Kiersztyn, Witold Pedrycz, Paweł Karczmarek
Publication year: 2022
Subject:
Source: IEEE Transactions on Fuzzy Systems. 30:2850-2860
ISSN: 1941-0034, 1063-6706
Description: Anomaly (outlier) detection is one of the most important problems in modern data analysis. The sources of anomalies vary: they can result from database users' mistakes, operational errors, or simply missing values. The problem is particularly important because of the rapid growth of large data sets. Therefore, in this study, we present detailed results of work on a Granular Computing-based approach to anomaly detection, classification, and gradation. The aim of the study is to introduce an innovative solution that uses information granules to identify and classify anomalies. The novelty of the proposed solution lies in the use of fuzzy semantics implied by the statistical properties of the data considered. Moreover, instead of the classic approach to detecting anomalies in the data, we propose determining the degree of anomaly for the data transformed to a new state space. Using the universal descriptor space makes it possible to determine the degree of anomaly, and by applying various aggregation methods one can also specify its type.
Database: OpenAIRE