Výsledky vyhledávání

Report

Threading in star catenanes: The role of ring rigidity, topology and environmental crowding

This study investigates the probability of threading in star catenanes under good solvent conditions using molecular dynamics simulations, emphasizing the influence of ring rigidity. Threading in these systems arises from the interplay between the in

Externí odkaz: http://arxiv.org/abs/2412.07860

Zobrazit plný text záznamu

Report

Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

We introduce the Aya Expanse model family, a new generation of 8B and 32B parameter multilingual language models, aiming to address the critical challenge of developing highly performant multilingual models that match or surpass the capabilities of m

Externí odkaz: http://arxiv.org/abs/2412.04261

Zobrazit plný text záznamu

Report

If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs

Autor: Khalifa, Muhammad, Tan, Yi-Chern, Ahmadian, Arash, Hosking, Tom, Lee, Honglak, Wang, Lu, Üstün, Ahmet, Sherborne, Tom, Gallé, Matthias

Model merging has shown great promise at combining expert models, but the benefit of merging is unclear when merging ``generalist'' models trained on many tasks. We explore merging in the context of large (~100B) models, by recycling checkpoints that

Externí odkaz: http://arxiv.org/abs/2412.04144

Zobrazit plný text záznamu

Report

Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning

Autor: Aakanksha, Ahmadian, Arash, Goldfarb-Tarrant, Seraphina, Ermis, Beyza, Fadaee, Marzieh, Hooker, Sara

Large Language Models (LLMs) have been adopted and deployed worldwide for a broad variety of applications. However, ensuring their safe use remains a significant challenge. Preference training and safety measures often overfit to harms prevalent in W

Externí odkaz: http://arxiv.org/abs/2410.10801

Zobrazit plný text záznamu

Report

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

Autor: Yu, Simon, Chen, Liangyu, Ahmadian, Sara, Fadaee, Marzieh

Finetuning large language models on instruction data is crucial for enhancing pre-trained knowledge and improving instruction-following capabilities. As instruction datasets proliferate, selecting optimal data for effective training becomes increasin

Externí odkaz: http://arxiv.org/abs/2409.11378

Zobrazit plný text záznamu

Report

A New Perspective to Fish Trajectory Imputation: A Methodology for Spatiotemporal Modeling of Acoustically Tagged Fish Data

Autor: Ahmadian, Mahshid, Boone, Edward L., Chiu, Grace S.

The focus of this paper is a key component of a methodology for understanding, interpolating, and predicting fish movement patterns based on spatiotemporal data recorded by spatially static acoustic receivers. For periods of time, fish may be far fro

Externí odkaz: http://arxiv.org/abs/2408.13220

Zobrazit plný text záznamu

Report

Superior Scoring Rules for Probabilistic Evaluation of Single-Label Multi-Class Classification Tasks

Autor: Ahmadian, Rouhollah, Ghatee, Mehdi, Wahlström, Johan

This study introduces novel superior scoring rules called Penalized Brier Score (PBS) and Penalized Logarithmic Loss (PLL) to improve model evaluation for probabilistic classification. Traditional scoring rules like Brier Score and Logarithmic Loss s

Externí odkaz: http://arxiv.org/abs/2407.17697

Zobrazit plný text záznamu

Report

RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs

Autor: Dang, John, Ahmadian, Arash, Marchisio, Kelly, Kreutzer, Julia, Üstün, Ahmet, Hooker, Sara

Preference optimization techniques have become a standard final stage for training state-of-art large language models (LLMs). However, despite widespread adoption, the vast majority of work to-date has focused on first-class citizen languages like En

Externí odkaz: http://arxiv.org/abs/2407.02552

Zobrazit plný text záznamu

Report

Averaging log-likelihoods in direct alignment

Autor: Grinsztajn, Nathan, Flet-Berliac, Yannis, Azar, Mohammad Gheshlaghi, Strub, Florian, Wu, Bill, Choi, Eugene, Cremer, Chris, Ahmadian, Arash, Chandak, Yash, Pietquin, Olivier, Geist, Matthieu

To better align Large Language Models (LLMs) with human judgment, Reinforcement Learning from Human Feedback (RLHF) learns a reward model and then optimizes it using regularized RL. Recently, direct alignment methods were introduced to learn such a f

Externí odkaz: http://arxiv.org/abs/2406.19188

Zobrazit plný text záznamu

Report

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

Autor: Flet-Berliac, Yannis, Grinsztajn, Nathan, Strub, Florian, Choi, Eugene, Cremer, Chris, Ahmadian, Arash, Chandak, Yash, Azar, Mohammad Gheshlaghi, Pietquin, Olivier, Geist, Matthieu

Reinforcement Learning (RL) has been used to finetune Large Language Models (LLMs) using a reward model trained from preference data, to better align with human judgment. The recently introduced direct alignment methods, which are often simpler, more

Externí odkaz: http://arxiv.org/abs/2406.19185

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání