Showing 1 - 10 of 16 for search: '"Chu, Xu"'
Author:
Wu, Renzhi, Chunduri, Pramod, Shah, Dristi J, Aravind, Ashmitha Julius, Payani, Ali, Chu, Xu, Arulraj, Joy, Rong, Kexin
Published in:
Published on International Conference on Very Large Databases 2024
In this paper, we will present SketchQL, a video database management system (VDBMS) for retrieving video moments with a sketch-based query interface. This novel interface allows users to specify object trajectory events with simple mouse drag-and-dro
External link:
http://arxiv.org/abs/2405.18334
Published in:
ACM SIGMOD 2023
Data preprocessing is a crucial step in the machine learning process that transforms raw data into a more usable format for downstream ML models. However, it can be costly and time-consuming, often requiring the expertise of domain experts. Existing
External link:
http://arxiv.org/abs/2308.10915
Entity matching (EM) refers to the problem of identifying pairs of data records in one or more relational tables that refer to the same entity in the real world. Supervised machine learning (ML) models currently achieve state-of-the-art matching perf
External link:
http://arxiv.org/abs/2211.06975
To reduce the human annotation efforts, the programmatic weak supervision (PWS) paradigm abstracts weak supervision sources as labeling functions (LFs) and involves a label model to aggregate the output of multiple LFs to produce training labels. Mos
External link:
http://arxiv.org/abs/2207.13545
Published in:
PVLDB, 15(2): 272 - 284, 2022
Estimating the number of distinct values (NDV) in a column is useful for many tasks in database systems, such as columnstore compression and data profiling. In this work, we focus on how to derive accurate NDV estimations from random (online/offline)
External link:
http://arxiv.org/abs/2202.02800
Published in:
PVLDB, 14(12): 2735-2738, 2021
Entity matching (EM) refers to the problem of identifying tuple pairs in one or more relations that refer to the same real world entities. Supervised machine learning (ML) approaches, and deep learning based approaches in particular, typically achiev
External link:
http://arxiv.org/abs/2106.10821
Machine learning (ML) is increasingly being used to make decisions in our society. ML models, however, can be unfair to certain demographic groups (e.g., African Americans or females) according to various fairness metrics. Existing techniques for pro
External link:
http://arxiv.org/abs/2103.09055
Fuzzy similarity join is an important database operator widely used in practice. So far the research community has focused exclusively on optimizing fuzzy join \textit{scalability}. However, practitioners today also struggle to optimize fuzzy-join \t
External link:
http://arxiv.org/abs/2103.04489
Machine learning (ML) applications have been thriving recently, largely attributed to the increasing availability of data. However, inconsistency and incomplete information are ubiquitous in real-world datasets, and their impact on ML applications re
External link:
http://arxiv.org/abs/2005.05117
Entity resolution (ER) refers to the problem of matching records in one or more relations that refer to the same real-world entity. While supervised machine learning (ML) approaches achieve the state-of-the-art results, they require a large amount of
External link:
http://arxiv.org/abs/1908.06049