ODCA: An Outlier Detection Approach to Deal with Correlated Attributes
Author: | Fabio Fassetti, Fabrizio Angiulli, Cristina Serrao |
---|---|
Year of publication: | 2021 |
Subject: | |
Source: | Big Data Analytics and Knowledge Discovery ISBN: 9783030865337 DaWaK |
DOI: | 10.1007/978-3-030-86534-4_17 |
Description: | Datasets from different domains usually contain data defined over a wide set of attributes or features linked through correlation relationships. Moreover, in some applications not all attributes should be treated in the same fashion, since some of them can be regarded as independent variables that determine the expected behaviour of the remaining ones. Following this pattern, we focus on detecting those data objects that show anomalous behaviour on a subset of attributes, called behavioural, with respect to the other ones, which we call contextual. As a first contribution, we exploit Mixture Models to describe the data distribution over each pair of behavioural-contextual attributes and learn the correlation laws binding the data in each bidimensional space. Then, we design a probability measure aimed at scoring subsequently observed objects based on how much their behaviour differs from the usual behavioural attribute values. Finally, we join the contributions calculated in each bidimensional space to provide a global outlierness measure. We test our method on both synthetic and real datasets to demonstrate its effectiveness when studying anomalous behaviour in a specific context and its ability to outperform some competitive baselines. |
Database: | OpenAIRE |