Showing 1 - 10 of 219 for search: '"KIMELFELD, BENNY"'
Datasets may include errors, and specifically violations of integrity constraints, for various reasons. Standard techniques for "minimal-cost" database repairing resolve these violations by aiming for minimum change in the data, and in the process,
External link:
http://arxiv.org/abs/2410.16501
Datasets often contain values that naturally reside in a metric space: numbers, strings, geographical locations, machine-learned embeddings in a Euclidean space, and so on. We study the computational complexity of repairing inconsistent databases tha
External link:
http://arxiv.org/abs/2409.16713
Author:
Light, Dean; Aiashy, Ahmad; Diab, Mahmoud; Nachmias, Daniel; Vansummeren, Stijn; Kimelfeld, Benny
Document spanners have been proposed as a formal framework for declarative Information Extraction (IE) from text, following IE products from the industry and academia. Over the past decade, the framework has been studied thoroughly in terms of expres
External link:
http://arxiv.org/abs/2409.01736
Recent studies investigated the challenge of assessing the strength of a given claim extracted from a dataset, particularly the claim's potential of being misleading and cherry-picked. We focus on claims that compare answers to an aggregate query pos
External link:
http://arxiv.org/abs/2408.14974
There is a large volume of late antique and medieval Hebrew texts. They represent a crucial linguistic and cultural bridge between Biblical and modern Hebrew. Poetry is prominent in these texts and one of its main characteristics is the frequent use o
External link:
http://arxiv.org/abs/2402.17371
Published in:
CIKM 2023
Machinery for data analysis often requires a numeric representation of the input. Towards that, a common practice is to embed components of structured data into a high-dimensional vector space. We study the embedding of the tuples of a relational dat
External link:
http://arxiv.org/abs/2401.11215
Published in:
SIGMOD Rec. 52(2): 6-17 (2023)
Attribution scores can be applied in data management to quantify the contribution of individual items to conclusions from the data, as part of the explanation of what led to these conclusions. In Artificial Intelligence, Machine Learning, and Data Ma
External link:
http://arxiv.org/abs/2401.06234
We propose and study a framework for quantifying the importance of the choices of parameter values to the result of a query over a database. These parameters occur as constants in logical queries, such as conjunctive queries. In our framework, the im
External link:
http://arxiv.org/abs/2401.04606
Local explanation methods highlight the input tokens that have a considerable impact on the outcome of classifying the document at hand. For example, the Anchor algorithm applies a statistical analysis of the sensitivity of the classifier to changes
External link:
http://arxiv.org/abs/2312.07991
Factorized representations (FRs) are a well-known tool to succinctly represent results of join queries and have been originally defined using the named database perspective. We define FRs in the unnamed database perspective and use them to establish
External link:
http://arxiv.org/abs/2309.11663