Towards distribution-aware query answering in data markets

Authors: Abolfazl Asudeh, Fatemeh Nargesian
Publication year: 2022
Subject:
Source: Proceedings of the VLDB Endowment. 15:3137-3144
ISSN: 2150-8097
Description: The increasing demand for data exchange has led to the development of data markets that facilitate transactional interactions between data buyers and data sellers. Still, cost-effective and distribution-aware query answering is a substantial challenge in these environments. In this paper, we distinguish between different types of data markets and take initial steps towards addressing this challenge. In particular, we envision a unified query answering framework and discuss its functionalities. Our framework enables integrating data from different sources in a data market, cost-effectively, into a dataset that meets user-provided schema and distribution requirements. To facilitate consumers' query answering, our system discovers data views in the form of join-paths on relevant data sources, defines a get-next operation to query views, and estimates the cost of get-next on each view. The query answering engine then selects the next views to sample sequentially to collect the output data. Depending on how much the system knows about the underlying data sources, the view selection problem can be modeled as an instance of a multi-armed bandit problem or of the coupon collector's problem.
Database: OpenAIRE
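
The two models named in the description can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes a hypothetical interface in which each view's get-next is a callable returning one sampled row and the cost paid for it, it swaps in the standard UCB1 rule for the bandit case, and it uses "new distinct row per unit cost" as an illustrative reward (costs are assumed to be at least 1 so that rewards stay in [0, 1]). The names `GetNext`, `select_views_ucb1`, and `expected_draws_uniform` are invented for this example.

```python
import math
from typing import Callable, Dict, Hashable, Set, Tuple

# Hypothetical interface: calling a view's get-next returns (row, cost).
GetNext = Callable[[], Tuple[Hashable, float]]


def expected_draws_uniform(n: int) -> float:
    """Coupon collector baseline: expected number of uniform draws
    needed to see all n distinct items, which equals n * H_n."""
    return n * sum(1.0 / i for i in range(1, n + 1))


def select_views_ucb1(
    views: Dict[str, GetNext], target_size: int, max_calls: int
) -> Tuple[Set[Hashable], Dict[str, int]]:
    """Sequentially pick which view to sample next with UCB1.

    Reward of a call (an assumption of this sketch): 1 / cost if the
    returned row is new, 0 otherwise.
    """
    collected: Set[Hashable] = set()
    pulls = {name: 0 for name in views}   # get-next calls per view
    gain = {name: 0.0 for name in views}  # accumulated reward per view

    for t in range(1, max_calls + 1):
        if len(collected) >= target_size:
            break
        untried = [name for name in views if pulls[name] == 0]
        if untried:
            # Sample every view once before trusting any estimate.
            name = untried[0]
        else:
            # Mean reward plus an exploration bonus that shrinks
            # as a view accumulates calls.
            name = max(
                views,
                key=lambda v: gain[v] / pulls[v]
                + math.sqrt(2.0 * math.log(t) / pulls[v]),
            )
        row, cost = views[name]()
        is_new = row not in collected
        collected.add(row)
        pulls[name] += 1
        gain[name] += (1.0 if is_new else 0.0) / max(cost, 1.0)

    return collected, pulls
```

When the contents of each view are known in advance, a closed form such as `expected_draws_uniform` can estimate how many get-next calls a view needs; when they are unknown, the bandit loop learns from the samples which views keep producing useful rows at low cost. A distribution-aware variant would count a row as useful only if its group is still under-represented in the collected data.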