Techniques for Complex Analysis of Contemporary Data

Autor: Pavel Zezula, Michal Batko, Jakub Peschel
Rok vydání: 2020
Předmět:
Zdroj: Proceedings of the 2020 International Conference on Pattern Recognition and Intelligent Systems
DOI: 10.1145/3415048.3416097
Popis: Contemporary data objects are typically complex, semi-structured, or unstructured at all. Besides, objects are also related to form a network. In such a situation, data analysis requires not only the traditional attribute-based access but also access based on similarity as well as data mining operations. Though tools for such operations do exist, they usually specialise in operation and are available for specialized data structures supported by specific computer system environments. In contrary, advance analyses are obtained by application of several elementary access operations which in turn requires expert knowledge in multiple areas. In this paper, we propose a unification platform for various data analytical operators specified as a general-purpose analytical system ADAMiSS. An extensible data-mining and similarity-based set of operators over a common versatile data structure allow the recursive application of heterogeneous operations, thus allowing the definition of complex analytical processes, necessary to solve the contemporary analytical tasks. As a proof-of-concept, we present results that were obtained by our prototype implementation on two real-world data collections: the Twitter Higg's boson and the Kosarak datasets.
Databáze: OpenAIRE