Techniques for Complex Analysis of Contemporary Data
Authors: | Pavel Zezula, Michal Batko, Jakub Peschel |
---|---|
Year of publication: | 2020 |
Subject: |
Similarity (geometry); business.industry; Computer science; Nearest neighbor search; Data analysis; Similarity search; Pattern mining; 02 engineering and technology; Data structure; computer.software_genre; Set (abstract data type); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; business; computer |
Source: | Proceedings of the 2020 International Conference on Pattern Recognition and Intelligent Systems |
DOI: | 10.1145/3415048.3416097 |
Description: | Contemporary data objects are typically complex, semi-structured, or entirely unstructured. Moreover, objects are related to one another and form a network. In such a situation, data analysis requires not only traditional attribute-based access but also similarity-based access and data mining operations. Though tools for such operations exist, each usually specializes in a single operation and is available only for specialized data structures supported by specific computer system environments. In contrast, advanced analyses are obtained by applying several elementary access operations, which in turn requires expert knowledge in multiple areas. In this paper, we propose a unification platform for various data analytical operators, specified as a general-purpose analytical system, ADAMiSS. An extensible set of data-mining and similarity-based operators over a common versatile data structure allows the recursive application of heterogeneous operations, thus enabling the definition of complex analytical processes needed to solve contemporary analytical tasks. As a proof of concept, we present results obtained by our prototype implementation on two real-world data collections: the Twitter Higgs boson and the Kosarak datasets. |
Database: | OpenAIRE |