Showing 1 - 10 of 2,085 for search: '"D'Amour, P."'
Author:
Zhu, Yuchen, de Souza, Daniel Augusto, Shi, Zhengyan, Yang, Mengyue, Minervini, Pasquale, D'Amour, Alexander, Kusner, Matt J.
We address the problem of reward hacking, where maximising a proxy reward does not necessarily increase the true reward. This is a key concern for Large Language Models (LLMs), as they are often fine-tuned on human preferences that may not accurately…
External link:
http://arxiv.org/abs/2412.16475
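A toy illustration of the reward-hacking setting described in the entry above: a proxy reward that tracks the true reward only up to a point, so that maximising the proxy eventually decreases the true reward. Both reward functions are made-up examples for illustration, not taken from the paper.

```python
# Minimal sketch: proxy vs. true reward under naive proxy maximisation.
import numpy as np

x = np.linspace(0.0, 5.0, 501)       # "policy" parameter being optimised
true_reward = -(x - 1.0) ** 2        # what we actually care about (optimum at x = 1)
proxy_reward = x - 0.1 * x ** 2      # imperfect stand-in that keeps rising past x = 1

x_proxy_opt = x[np.argmax(proxy_reward)]
print(f"proxy-optimal x = {x_proxy_opt:.2f}, "
      f"true reward there = {-(x_proxy_opt - 1.0) ** 2:.2f} "
      f"(true optimum is 0.0 at x = 1.0)")
```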
Author:
Schrouff, Jessica, Bellot, Alexis, Rannen-Triki, Amal, Malek, Alan, Albuquerque, Isabela, Gretton, Arthur, D'Amour, Alexander, Chiappa, Silvia
Failures of fairness or robustness in machine learning predictive settings can be due to undesired dependencies between covariates, outcomes and auxiliary factors of variation. A common strategy to mitigate these failures is data balancing, which…
External link:
http://arxiv.org/abs/2406.17433
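A minimal sketch of the data-balancing idea mentioned in the entry above: reweighting examples so that an auxiliary group attribute becomes statistically independent of the outcome in the weighted sample. The variable names and the synthetic data are illustrative assumptions, not the paper's setup.

```python
# Reweight (group, label) cells so the weighted joint factorises: P(g, y) = P(g) P(y).
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)                     # outcome
g = rng.random(1000) < np.where(y == 1, 0.8, 0.3)     # group attribute correlated with outcome

w = np.zeros(len(y))
for gi in (0, 1):
    for yi in (0, 1):
        cell = (g == gi) & (y == yi)
        target = (g == gi).mean() * (y == yi).mean()  # product of marginals
        w[cell] = target / cell.mean()                # weight = target / empirical cell mass

# Weighted covariance between group and label is ~0 after balancing.
cov = np.average((g - np.average(g, weights=w)) * (y - np.average(y, weights=w)), weights=w)
print(f"weighted covariance between group and label: {cov:.4f}")
```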
Author:
Anthis, Jacy, Lum, Kristian, Ekstrand, Michael, Feller, Avi, D'Amour, Alexander, Tan, Chenhao
The need for fair AI is increasingly clear in the era of general-purpose systems such as ChatGPT, Gemini, and other large language models (LLMs). However, the increasing complexity of human-AI interaction and its social impacts have raised questions…
External link:
http://arxiv.org/abs/2406.03198
Author:
Tsai, Katherine, Pfohl, Stephen R., Salaudeen, Olawale, Chiou, Nicole, Kusner, Matt J., D'Amour, Alexander, Koyejo, Sanmi, Gretton, Arthur
We study the problem of domain adaptation under distribution shift, where the shift is due to a change in the distribution of an unobserved, latent variable that confounds both the covariates and the labels. In this setting, neither the covariate shift…
External link:
http://arxiv.org/abs/2403.07442
Author:
Alabdulmohsin, Ibrahim, Wang, Xiao, Steiner, Andreas, Goyal, Priya, D'Amour, Alexander, Zhai, Xiaohua
Published in:
ICLR 2024
We study the effectiveness of data-balancing for mitigating biases in contrastive language-image pretraining (CLIP), identifying areas of strength and limitation. First, we reaffirm prior conclusions that CLIP models can inadvertently absorb societal…
External link:
http://arxiv.org/abs/2403.04547
Bias benchmarks are a popular method for studying the negative impacts of bias in LLMs, yet there has been little empirical investigation of whether these benchmarks are actually indicative of how harm may manifest in the real world…
External link:
http://arxiv.org/abs/2402.12649
Author:
Watson-Daniels, Jamelle, Calmon, Flavio du Pin, D'Amour, Alexander, Long, Carol, Parkes, David C., Ustun, Berk
Machine learning models in modern mass-market applications are often updated over time. One of the foremost challenges faced is that, despite increasing overall performance, these updates may flip specific model predictions in unpredictable ways…
External link:
http://arxiv.org/abs/2402.07745
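A small sketch of measuring the prediction flips described in the entry above: the fraction of examples whose predicted label changes between two versions of a model. The two logistic-regression models and the synthetic data are toy stand-ins, assumed here for illustration.

```python
# Compare predictions of an original and an "updated" model on the same points.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_extra, y_extra = make_classification(n_samples=500, n_features=20, random_state=1)

model_v1 = LogisticRegression(max_iter=1000).fit(X, y)
model_v2 = LogisticRegression(max_iter=1000).fit(          # v2: retrained on more data
    np.vstack([X, X_extra]), np.concatenate([y, y_extra])
)

pred_v1, pred_v2 = model_v1.predict(X), model_v2.predict(X)
churn = (pred_v1 != pred_v2).mean()                         # fraction of flipped predictions
print(f"accuracy v1: {(pred_v1 == y).mean():.3f}, "
      f"accuracy v2: {(pred_v2 == y).mean():.3f}, flip rate: {churn:.3f}")
```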
Author:
Wang, Zihao, Nagpal, Chirag, Berant, Jonathan, Eisenstein, Jacob, D'Amour, Alex, Koyejo, Sanmi, Veitch, Victor
A common approach for aligning language models to human preferences is to first learn a reward model from preference data, and then use this reward model to update the language model. We study two closely related problems that arise in this approach.
External link:
http://arxiv.org/abs/2402.00742
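A compact sketch of the first step of the approach described in the entry above: fitting a reward model to pairwise preference data with a Bradley-Terry loss. The feature vectors and the linear reward head are illustrative assumptions; the second step, updating the language model against the learned reward, is only indicated in a comment.

```python
# Step 1: learn a reward model from preference pairs (chosen vs. rejected).
import torch
import torch.nn as nn

torch.manual_seed(0)
dim = 16
reward_model = nn.Linear(dim, 1)            # toy reward head r(x) = w . phi(x)

# Toy preference data: features of the preferred and rejected response per pair.
phi_chosen = torch.randn(256, dim) + 0.5
phi_rejected = torch.randn(256, dim)

opt = torch.optim.Adam(reward_model.parameters(), lr=1e-2)
for _ in range(200):
    r_chosen = reward_model(phi_chosen).squeeze(-1)
    r_rejected = reward_model(phi_rejected).squeeze(-1)
    # Bradley-Terry objective: maximise log sigma(r_chosen - r_rejected).
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Step 2 (not shown): use the learned reward, e.g. via RL or best-of-n sampling,
# to update the language model's policy.
print(f"final preference loss: {loss.item():.3f}")
```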
Author:
Lydia Trippler, Said Mohammed Ali, Msanif Othman Masoud, Zahor Hamad Mohammed, Amour Khamis Amour, Khamis Rashid Suleiman, Shaali Makame Ame, Fatma Kabole, Jan Hattendorf, Stefanie Knopp
Published in:
Parasites & Vectors, Vol 17, Iss 1, Pp 1-13 (2024)
Abstract Background: The World Health Organization (WHO) has set the goal of eliminating schistosomiasis as a public health problem globally by 2030 and of interrupting transmission in selected areas. Chemical snail control is one important measure…
External link:
https://doaj.org/article/9f2af590f3a14df2abae600370cdf5ef
Author:
Beirami, Ahmad, Agarwal, Alekh, Berant, Jonathan, D'Amour, Alexander, Eisenstein, Jacob, Nagpal, Chirag, Suresh, Ananda Theertha
A simple and effective method for the alignment of generative models is the best-of-$n$ policy, where $n$ samples are drawn from a base policy, ranked by a reward function, and the highest-ranking one is selected. A commonly used analytical…
External link:
http://arxiv.org/abs/2401.01879
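A minimal sketch of the best-of-$n$ policy described in the entry above. The base "policy" (a random text generator) and the reward function are hypothetical stand-ins chosen only to make the procedure concrete; they are not APIs from the paper.

```python
# Best-of-n: draw n samples from the base policy, rank by reward, keep the best.
import random

def base_policy_sample() -> str:
    """Stand-in for sampling one completion from the base policy."""
    vocab = ["good", "bad", "helpful", "harmful", "neutral"]
    return " ".join(random.choices(vocab, k=5))

def reward(text: str) -> float:
    """Stand-in reward model: counts 'good'/'helpful' tokens."""
    return float(sum(tok in {"good", "helpful"} for tok in text.split()))

def best_of_n(n: int) -> str:
    """Draw n samples from the base policy and return the highest-reward one."""
    samples = [base_policy_sample() for _ in range(n)]
    return max(samples, key=reward)

if __name__ == "__main__":
    print(best_of_n(16))
```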