Efficient computation for probabilistic skyline over uncertain preferences

Autor: Vineet Padmanabhan, Venkateswara Rao Kagita, Arun K. Pujari, Anubhuti Garg
Rok vydání: 2015
Předmět:
Zdroj: Information Sciences. 324:146-162
ISSN: 0020-0255
DOI: 10.1016/j.ins.2015.06.041
Popis: We propose an efficient algorithm that can compute skyline probability exactly for reasonably large database.We introduce the concept of zero-contributing set which has zero effect in the signed aggregate of joint probabilities.We propose an incremental algorithm to compute skyline probability in dynamic scenarios wherein objects are added incrementally.The theoretical concepts developed helps to devise an efficient technique to compute skyline probability of all objects in the database.Detailed experimental analysis for real and synthetic datasets are reported to corroborate our findings. Efficient computation of skyline probability over uncertain preferences has not received much attention in the database community as compared to skyline probability computation over uncertain data. All known algorithms for probabilistic skyline computation over uncertain preferences attempt to find inexact value of skyline probability by resorting to sampling or to approximation scheme. Exact computation of skyline probability for database with uncertain preferences of moderate size is not possible with any of the existing algorithms. In this paper, we propose an efficient algorithm that can compute skyline probability exactly for reasonably large database. The inclusion-exclusion principle is used to express skyline probability in terms of joint probabilities of all possible combination. In this regard we introduce the concept of zero-contributing set which has zero effect in the signed aggregate of joint probabilities. Our algorithm employs a prefix-based k-level absorption to identify zero-contributing sets. It is shown empirically that only a very small portion of exponential search space remains after level wise application of prefix-based absorption. Thus it becomes possible to compute skyline probability with respect to large datasets. Detailed experimental analysis for real and synthetic datasets are reported to corroborate this claim. We also propose an incremental algorithm to compute skyline probability in dynamic scenarios wherein objects are added incrementally. Moreover, the theoretical concepts developed in this paper help to devise an efficient technique to compute skyline probability of all objects in the database. We show that the exponential search space is pruned once and then for each individual object skyline probability can be derived by inspecting a portion of the pruned lattice. We also use a concept of revival of absorbed pairs. We believe that this process is more efficient than computing the skyline probability individually.
Databáze: OpenAIRE