Continuous Similarity Search for Evolving Database

Autor: Hisashi Koga, Daiki Noguchi
Rok vydání: 2020
Předmět:
Zdroj: Similarity Search and Applications ISBN: 9783030609351
SISAP
DOI: 10.1007/978-3-030-60936-8_12
Popis: Similarity search for data streams has attracted much attention recently in the area of information recommendation. This paper studies a continuous set similarity search which regards the latest W items in a data stream as an evolving set. So far, a top-k similarity search problem called CEQ (Continuous similarity search for Evolving Query) has been researched in the literature, where the query evolves dynamically and the database consists of multiple static sets. By contrast, this paper examines a new top-k similarity search problem, where the query is a static set and the database consists of multiple dynamic sets extracted from multiple data streams. This new problem is named as CED (Continuous similarity search for Evolving Database). Our main contribution is to develop a pruning-based exact algorithm for CED. Though our algorithm is created by extending the previous pruning-based exact algorithm for CEQ, it runs substantially faster than the one which simply adapts the exact algorithm for CEQ to CED. Our algorithm achieves this speed by devising two novel techniques to refine the similarity upper bounds for pruning.
Databáze: OpenAIRE