Multivariate correlations discovery in static and streaming data

Authors: Koen Minartz, Jens E. d'Hondt, Odysseas Papapetrou
Publication year: 2022
Subject:
Source: Proceedings of the VLDB Endowment. 15:1266-1278
ISSN: 2150-8097
DOI: 10.14778/3514061.3514072
Description: Correlation analysis is an invaluable tool in many domains for better understanding data and extracting salient insights. Most works to date focus on detecting high pairwise correlations. A generalization of this problem, with known applications but no known efficient solutions, is the discovery of strong multivariate correlations, i.e., finding sets of vectors (typically on the order of 3 to 5 vectors) that exhibit a strong dependence when considered together. In this work we propose algorithms for detecting multivariate correlations in static and streaming data. Our algorithms, which rely on novel theoretical results, support two different correlation measures and allow for additional constraints. Our extensive experimental evaluation examines the properties of our solution and demonstrates that our algorithms outperform the state of the art, typically by an order of magnitude.
Database: OpenAIRE