DiVA: Using Application-Specific Policies to ‘Dive’ into Vector Approximations
Autor: | Alex Delis, Spiros Evangelatos, Konstantinos Tsakalozos, Vassilis J. Tsotras, Marcos R. Vieira, Fotis Psallidas |
---|---|
Rok vydání: | 2015 |
Předmět: |
Structure (mathematical logic)
Theoretical computer science General Computer Science Exploit Computer science Data domain Search engine indexing Vector quantization 020207 software engineering 02 engineering and technology computer.software_genre Diva Tree (data structure) Data access 020204 information systems 0202 electrical engineering electronic engineering information engineering Data mining computer |
Zdroj: | The Computer Journal. 59:1363-1382 |
ISSN: | 1460-2067 0010-4620 |
Popis: | In high-dimensional data domains, the performance of conventional tree-based access structures is occasionally outperformed by simple sequential scans. To this end, the introduction of approximation-based methods helped speed-up queries by providing compact representations of stored data. Approximation methods exploit vector quantization to index data mainly presumed to follow a uniform distribution. In real-world environments however, we mostly encounter both skewed data and query distributions. To address this dual challenge, we propose DiVA that combines the selective use of an approximation approach with an indexing mechanism to organize data subspaces in a high fan-out hierarchical structure. Moreover, DiVA reorganizes its own elements after receiving application hints regarding data access patterns. These hints or policies trigger the restructuring and possible expansion of DiVA so as to offer finer indexing granularity and improved access times in subspaces emerging as ‘hot-spots’. The novelty of our approach lies in the self-organizing nature of DiVA driven by application-provided policies; the latter effectively guide the refinement of DiVA’s elements as new data arrive, existing data are updated and the nature of query workloads continually changes. An extensive experimental evaluation using real data shows that DiVA reduces up-to 64% of the total number of I/Os if compared with state-of-art methods including the VA-file, GC-tree and A-tree. |
Databáze: | OpenAIRE |
Externí odkaz: |