Showing 1 - 10 of 8,650 results for search: '"Demeester, A."'
Author:
Meeus, Matthieu, Rathé, Anthony, Remy, François, Delobelle, Pieter, Decorte, Jens-Joris, Demeester, Thomas
While Large Language Models (LLMs) have shown remarkable capabilities in natural language understanding and generation, their performance often lags in lower-resource, non-English languages due to biases in the training data. In this work, we explore
External link:
http://arxiv.org/abs/2412.07633
Author:
Decruyenaere, Alexander, Dehaene, Heidelinde, Rabaey, Paloma, Polet, Christiaan, Decruyenaere, Johan, Demeester, Thomas, Vansteelandt, Stijn
While synthetic data hold great promise for privacy protection, their statistical analysis poses significant challenges that necessitate innovative solutions. The use of deep generative models (DGMs) for synthetic data generation is known to induce c
External link:
http://arxiv.org/abs/2411.04216
Negative Prompting (NP) is widely utilized in diffusion models, particularly in text-to-image applications, to prevent the generation of undesired features. In this paper, we show that conventional NP is limited by the assumption of a constant guidan
External link:
http://arxiv.org/abs/2410.14398
In large organisations, identifying experts on a given topic is crucial in leveraging the internal knowledge spread across teams and departments. So-called enterprise expert retrieval systems automatically discover and structure employees' expertise
External link:
http://arxiv.org/abs/2410.05018
Accurately modeling the relationships between skills is a crucial part of human resources processes such as recruitment and employee development. Yet, no benchmarks exist to evaluate such methods directly. We construct and release SkillMatch, a bench
External link:
http://arxiv.org/abs/2410.05006
One of the central goals of causal machine learning is the accurate estimation of heterogeneous treatment effects from observational data. In recent years, meta-learning has emerged as a flexible, model-agnostic paradigm for estimating conditional av
External link:
http://arxiv.org/abs/2409.15503
We present the SynSUM benchmark, a synthetic dataset linking unstructured clinical notes to structured background variables. The dataset consists of 10,000 artificial patient records containing tabular variables (like symptoms, diagnoses and underlyi
External link:
http://arxiv.org/abs/2409.08936
Author:
D'Oosterlinck, Karel, Xu, Winnie, Develder, Chris, Demeester, Thomas, Singh, Amanpreet, Potts, Christopher, Kiela, Douwe, Mehri, Shikib
Large Language Models (LLMs) are often aligned using contrastive alignment objectives and preference pair datasets. The interaction between model, paired data, and objective makes alignment a complicated procedure, sometimes producing subpar results.
External link:
http://arxiv.org/abs/2408.06266
Author:
Remy, François, Delobelle, Pieter, Avetisyan, Hayastan, Khabibullina, Alfiya, de Lhoneux, Miryam, Demeester, Thomas
The development of monolingual language models for low and mid-resource languages continues to be hindered by the difficulty in sourcing high-quality training data. In this study, we present a novel cross-lingual vocabulary transfer strategy, trans-t
External link:
http://arxiv.org/abs/2408.04303
Bayesian networks are well-suited for clinical reasoning on tabular data, but are less compatible with natural language data, for which neural networks provide a successful framework. This paper compares and discusses strategies to augment Bayesian n
External link:
http://arxiv.org/abs/2403.09481