Showing 1 - 10 of 4,118 results for search: '"Demeester, P"'
Authors:
Decruyenaere, Alexander, Dehaene, Heidelinde, Rabaey, Paloma, Polet, Christiaan, Decruyenaere, Johan, Demeester, Thomas, Vansteelandt, Stijn
While synthetic data hold great promise for privacy protection, their statistical analysis poses significant challenges that necessitate innovative solutions. The use of deep generative models (DGMs) for synthetic data generation is known to induce c
External link:
http://arxiv.org/abs/2411.04216
Negative Prompting (NP) is widely utilized in diffusion models, particularly in text-to-image applications, to prevent the generation of undesired features. In this paper, we show that conventional NP is limited by the assumption of a constant guidan
External link:
http://arxiv.org/abs/2410.14398
In large organisations, identifying experts on a given topic is crucial in leveraging the internal knowledge spread across teams and departments. So-called enterprise expert retrieval systems automatically discover and structure employees' expertise
External link:
http://arxiv.org/abs/2410.05018
Accurately modeling the relationships between skills is a crucial part of human resources processes such as recruitment and employee development. Yet, no benchmarks exist to evaluate such methods directly. We construct and release SkillMatch, a bench
External link:
http://arxiv.org/abs/2410.05006
One of the central goals of causal machine learning is the accurate estimation of heterogeneous treatment effects from observational data. In recent years, meta-learning has emerged as a flexible, model-agnostic paradigm for estimating conditional av
External link:
http://arxiv.org/abs/2409.15503
We present the SynSUM benchmark, a synthetic dataset linking unstructured clinical notes to structured background variables. The dataset consists of 10,000 artificial patient records containing tabular variables (like symptoms, diagnoses and underlyi
External link:
http://arxiv.org/abs/2409.08936
Authors:
D'Oosterlinck, Karel, Xu, Winnie, Develder, Chris, Demeester, Thomas, Singh, Amanpreet, Potts, Christopher, Kiela, Douwe, Mehri, Shikib
Large Language Models (LLMs) are often aligned using contrastive alignment objectives and preference pair datasets. The interaction between model, paired data, and objective makes alignment a complicated procedure, sometimes producing subpar results.
External link:
http://arxiv.org/abs/2408.06266
Authors:
Remy, François, Delobelle, Pieter, Avetisyan, Hayastan, Khabibullina, Alfiya, de Lhoneux, Miryam, Demeester, Thomas
The development of monolingual language models for low and mid-resource languages continues to be hindered by the difficulty in sourcing high-quality training data. In this study, we present a novel cross-lingual vocabulary transfer strategy, trans-t
External link:
http://arxiv.org/abs/2408.04303
Bayesian networks are well-suited for clinical reasoning on tabular data, but are less compatible with natural language data, for which neural networks provide a successful framework. This paper compares and discusses strategies to augment Bayesian n
External link:
http://arxiv.org/abs/2403.09481
In this paper, we present ECL, a novel multi-modal dataset containing the textual and numerical data from corporate 10K filings and associated binary bankruptcy labels. Furthermore, we develop and critically evaluate several classical and neural bank
External link:
http://arxiv.org/abs/2401.12652