Showing 1 - 10 of 353 results for search: '"JAGADISH, H. V."'
Fraud detection presents a challenging task characterized by ever-evolving fraud patterns and scarce labeled data. Existing methods predominantly rely on graph-based or sequence-based approaches. While graph-based approaches connect users through sha
External link:
http://arxiv.org/abs/2408.00513
Cohort studies are of significant importance in the field of healthcare analysis. However, existing methods typically involve manual, labor-intensive, and expert-driven pattern definitions or rely on simplistic clustering techniques that lack medical
External link:
http://arxiv.org/abs/2406.14015
Large language models (LLMs) can generate long-form and coherent text, yet they often hallucinate facts, which undermines their reliability. To mitigate this issue, inference-time methods steer LLM representations toward the "truthful directions" pre
External link:
http://arxiv.org/abs/2405.00301
The potential harms of the under-representation of minorities in training data, particularly in multi-modal settings, are a well-recognized concern. While there has been extensive effort in detecting such under-representation, resolution has remained
External link:
http://arxiv.org/abs/2402.01071
Language models and specialized table embedding models have recently demonstrated strong performance on many tasks over tabular data. Researchers and practitioners are keen to leverage these models in many new application contexts; but limited unders
External link:
http://arxiv.org/abs/2310.07736
Before applying data analytics or machine learning to a data set, a vital step is usually the construction of an informative set of features from the data. In this paper, we present SMARTFEAT, an efficient automated feature engineering tool to assist
External link:
http://arxiv.org/abs/2309.07856
The large size and fast growth of data repositories, such as data lakes, have spurred the need for data discovery to help analysts find related data. The problem has become challenging as (i) a user typically does not know what datasets exist in an en
External link:
http://arxiv.org/abs/2301.04901
Real-life tools for decision-making in many critical domains are based on ranking results. With the increasing awareness of algorithmic fairness, recent works have presented measures for fairness in ranking. Many of those definitions consider the rep
External link:
http://arxiv.org/abs/2301.00719
Data discovery is a major challenge in enterprise data analysis: users often struggle to find data relevant to their analysis goals or even to navigate through data across data sources, each of which may easily contain thousands of tables. One common
External link:
http://arxiv.org/abs/2212.14155
As the popularity of graph data increases, there is a growing need to count the occurrences of subgraph patterns of interest, for a variety of applications. Many graphs are massive in scale and also fully dynamic (with insertions and deletions of edg
External link:
http://arxiv.org/abs/2211.06793