Continuous Similarity Search for Evolving Database
Autor: | Hisashi Koga, Daiki Noguchi |
---|---|
Rok vydání: | 2020 |
Předmět: |
Data stream
Database Computer science Data stream mining Nearest neighbor search 02 engineering and technology computer.software_genre Set (abstract data type) Exact algorithm Similarity (network science) 020204 information systems Sliding window protocol 0202 electrical engineering electronic engineering information engineering 020201 artificial intelligence & image processing Pruning (decision trees) computer |
Zdroj: | Similarity Search and Applications ISBN: 9783030609351 SISAP |
DOI: | 10.1007/978-3-030-60936-8_12 |
Popis: | Similarity search for data streams has attracted much attention recently in the area of information recommendation. This paper studies a continuous set similarity search which regards the latest W items in a data stream as an evolving set. So far, a top-k similarity search problem called CEQ (Continuous similarity search for Evolving Query) has been researched in the literature, where the query evolves dynamically and the database consists of multiple static sets. By contrast, this paper examines a new top-k similarity search problem, where the query is a static set and the database consists of multiple dynamic sets extracted from multiple data streams. This new problem is named as CED (Continuous similarity search for Evolving Database). Our main contribution is to develop a pruning-based exact algorithm for CED. Though our algorithm is created by extending the previous pruning-based exact algorithm for CEQ, it runs substantially faster than the one which simply adapts the exact algorithm for CEQ to CED. Our algorithm achieves this speed by devising two novel techniques to refine the similarity upper bounds for pruning. |
Databáze: | OpenAIRE |
Externí odkaz: |