Showing 1 - 10 of 14,734
for the search: '"Ahmadian A"'
Author:
Ahmadian Ali, Tavakkolisaiej Vahid, Scherer Torsten, Mazilkin Andrey, Ivanisenko Yulia, Kübel Christian
Published in:
BIO Web of Conferences, Vol 129, p 23010 (2024)
External link:
https://doaj.org/article/a97987665a15460dbdf4aa372f988b93
Author:
Aakanksha, Ahmadian, Arash, Goldfarb-Tarrant, Seraphina, Ermis, Beyza, Fadaee, Marzieh, Hooker, Sara
Large Language Models (LLMs) have been adopted and deployed worldwide for a broad variety of applications. However, ensuring their safe use remains a significant challenge. Preference training and safety measures often overfit to harms prevalent in Western…
External link:
http://arxiv.org/abs/2410.10801
Finetuning large language models on instruction data is crucial for enhancing pre-trained knowledge and improving instruction-following capabilities. As instruction datasets proliferate, selecting optimal data for effective training becomes increasingly…
External link:
http://arxiv.org/abs/2409.11378
The focus of this paper is a key component of a methodology for understanding, interpolating, and predicting fish movement patterns based on spatiotemporal data recorded by spatially static acoustic receivers. For periods of time, fish may be far from…
External link:
http://arxiv.org/abs/2408.13220
Superior Scoring Rules for Probabilistic Evaluation of Single-Label Multi-Class Classification Tasks
This study introduces novel superior scoring rules called Penalized Brier Score (PBS) and Penalized Logarithmic Loss (PLL) to improve model evaluation for probabilistic classification. Traditional scoring rules like the Brier Score and Logarithmic Loss… A minimal sketch of these two baseline rules follows the external link below.
External link:
http://arxiv.org/abs/2407.17697
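The penalized rules (PBS and PLL) are defined in the paper itself and are not reproduced here; as background for the abstract above, the following is a minimal NumPy sketch of the two baseline scoring rules it builds on, the multi-class Brier Score and the Logarithmic Loss. The example predictions and labels are made up.

```python
import numpy as np

def brier_score(probs, labels):
    """Multi-class Brier Score: mean squared error between the predicted
    probability vectors and the one-hot true labels (lower is better)."""
    probs = np.asarray(probs, dtype=float)      # shape (N, K)
    onehot = np.eye(probs.shape[1])[labels]     # shape (N, K)
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

def log_loss(probs, labels, eps=1e-12):
    """Logarithmic Loss: negative mean log-probability assigned to the
    true class (lower is better)."""
    probs = np.asarray(probs, dtype=float)
    true_p = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(np.clip(true_p, eps, 1.0)))

# Example: two 3-class predictions, true classes 0 and 2.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.3, 0.6]]
labels = [0, 2]
print(brier_score(probs, labels))  # ~0.20
print(log_loss(probs, labels))     # ~0.43
```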
Preference optimization techniques have become a standard final stage for training state-of-the-art large language models (LLMs). However, despite widespread adoption, the vast majority of work to date has focused on first-class citizen languages like English…
External link:
http://arxiv.org/abs/2407.02552
Author:
Grinsztajn, Nathan, Flet-Berliac, Yannis, Azar, Mohammad Gheshlaghi, Strub, Florian, Wu, Bill, Choi, Eugene, Cremer, Chris, Ahmadian, Arash, Chandak, Yash, Pietquin, Olivier, Geist, Matthieu
To better align Large Language Models (LLMs) with human judgment, Reinforcement Learning from Human Feedback (RLHF) learns a reward model and then optimizes it using regularized RL. Recently, direct alignment methods were introduced to learn such a fine-tuned model… A sketch of the regularized RL objective referred to here follows the external link below.
External link:
http://arxiv.org/abs/2406.19188
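For context on the "regularized RL" step mentioned above, the sketch below illustrates the standard KL-regularized RLHF objective, maximize E[reward(y)] - beta * KL(pi || pi_ref), estimated REINFORCE-style over a batch of sampled completions. This is a generic illustration under assumed names and tensor shapes, not the method proposed in the paper; the KL penalty is folded into the reward, as many RLHF implementations do.

```python
import torch

def kl_regularized_rl_loss(logp_policy, logp_ref, reward, beta=0.1):
    """REINFORCE-style estimate of the standard RLHF objective
    maximize E[reward(y)] - beta * KL(pi || pi_ref), written as a loss.

    logp_policy: summed log-prob of each sampled sequence under the policy (grad).
    logp_ref:    same sequences under the frozen reference model (no grad).
    reward:      sequence-level scores, e.g. from a learned reward model (no grad).
    """
    # Per-sample KL estimate: log-ratio between policy and reference model.
    kl = logp_policy.detach() - logp_ref
    # Fold the KL penalty into the reward (shaped reward carries no gradient).
    shaped_reward = reward - beta * kl
    # REINFORCE: weight each sequence's log-prob by its shaped reward.
    return -(shaped_reward * logp_policy).mean()

# Hypothetical batch of 4 sampled completions (values are placeholders).
logp_policy = torch.randn(4, requires_grad=True)
logp_ref = torch.randn(4)
reward = torch.randn(4)
loss = kl_regularized_rl_loss(logp_policy, logp_ref, reward)
loss.backward()
```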
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
Author:
Flet-Berliac, Yannis, Grinsztajn, Nathan, Strub, Florian, Choi, Eugene, Cremer, Chris, Ahmadian, Arash, Chandak, Yash, Azar, Mohammad Gheshlaghi, Pietquin, Olivier, Geist, Matthieu
Reinforcement Learning (RL) has been used to finetune Large Language Models (LLMs) using a reward model trained from preference data, to better align with human judgment. The recently introduced direct alignment methods, which are often simpler, more… A sketch of the underlying sequence-level policy-gradient setup follows the external link below.
External link:
http://arxiv.org/abs/2406.19185
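The contrastive objective itself is introduced in the paper and is not reproduced here; the sketch below only shows the plain sequence-level policy-gradient setup the abstract starts from, with the batch mean as a simple variance-reducing baseline. The function name and tensor shapes are illustrative assumptions.

```python
import torch

def sequence_policy_gradient_loss(logp_sequences, scores):
    """Vanilla REINFORCE on sequence-level scores with a mean baseline.

    logp_sequences: per-sequence log-probabilities summed over tokens (grad).
    scores:         sequence-level rewards, e.g. from a reward model (no grad).
    Sequences scored above the batch average have their log-probability pushed
    up; those below are pushed down.
    """
    advantage = scores - scores.mean()   # centre the scores as a baseline
    return -(advantage * logp_sequences).mean()

# Hypothetical batch of 4 sampled completions (values are placeholders).
logp = torch.tensor([-12.3, -9.8, -15.1, -11.0], requires_grad=True)
scores = torch.tensor([0.2, 0.9, -0.4, 0.5])
loss = sequence_policy_gradient_loss(logp, scores)
loss.backward()
```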
Author:
Aakanksha, Ahmadian, Arash, Ermis, Beyza, Goldfarb-Tarrant, Seraphina, Kreutzer, Julia, Fadaee, Marzieh, Hooker, Sara
A key concern with the concept of "alignment" is the implicit question of "alignment to what?". AI systems are increasingly used across the world, yet safety alignment is often focused on homogeneous monolingual settings. Additionally, preference training…
External link:
http://arxiv.org/abs/2406.18682
Understanding the correlation between fast and ultrafast demagnetization processes is crucial for elucidating the microscopic mechanisms underlying ultrafast demagnetization, which is pivotal for various applications in spintronics. Initial theoretical…
External link:
http://arxiv.org/abs/2406.09620