Showing 1 - 10 of 196 results for search: '"Foerster, Jakob"'
Given the widespread adoption and usage of Large Language Models (LLMs), it is crucial to have flexible and interpretable evaluations of their instruction-following ability. Preference judgments between model outputs have become the de facto evaluation...
External link: http://arxiv.org/abs/2410.03608
Authors: Towers, Sebastian, Kalisz, Aleksandra, Robert, Philippe A., Higueruelo, Alicia, Vianello, Francesca, Tsai, Ming-Han Chloe, Steel, Harrison, Foerster, Jakob N.
Anti-viral therapies are typically designed to target only the current strains of a virus. Game theoretically, this corresponds to a short-sighted, or myopic, response. However, therapy-induced selective pressures act on viruses to drive the emergence...
External link: http://arxiv.org/abs/2409.10588
Authors: Lupidi, Alisia, Gemmell, Carlos, Cancedda, Nicola, Dwivedi-Yu, Jane, Weston, Jason, Foerster, Jakob, Raileanu, Roberta, Lomeli, Maria
Large Language Models still struggle in challenging scenarios that leverage structured data, complex reasoning, or tool usage. In this paper, we propose Source2Synth: a new method that can be used for teaching LLMs new skills without relying on costly...
External link: http://arxiv.org/abs/2409.08239
Human intelligence emerged through the process of natural selection and evolution on Earth. We investigate what it would take to re-create this process in silico. While past work has often focused on low-level processes (such as simulating physics or...
External link: http://arxiv.org/abs/2409.00853
Authors: Rutherford, Alexander, Beukman, Michael, Willi, Timon, Lacerda, Bruno, Hawes, Nick, Foerster, Jakob
What data or environments to use for training to improve downstream performance is a longstanding and very topical question in reinforcement learning. In particular, Unsupervised Environment Design (UED) methods have gained recent attention as their...
External link: http://arxiv.org/abs/2408.15099
Authors: Zhang, Qizhen, Gritsch, Nikolas, Gnaneshwar, Dwaraknath, Guo, Simon, Cairuz, David, Venkitesh, Bharat, Foerster, Jakob, Blunsom, Phil, Ruder, Sebastian, Ustun, Ahmet, Locatelli, Acyr
The Mixture of Experts (MoE) framework has become a popular architecture for large language models due to its superior performance over dense models. However, training MoEs from scratch in a large-scale regime is prohibitively expensive. Existing methods...
External link: http://arxiv.org/abs/2408.08274
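
For context on the MoE framework this abstract refers to, below is a minimal, hypothetical sketch of a generic top-k routed MoE layer in PyTorch. It illustrates only the standard architecture, not the upcycling method proposed in the paper; the module names and dimensions are illustrative assumptions.

# Minimal sketch of a generic top-k routed Mixture-of-Experts layer (PyTorch).
# This shows the standard MoE framework mentioned in the abstract, NOT the method
# of arXiv:2408.08274; all names and sizes below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gate_logits = self.router(x)
        weights, expert_idx = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):  # send each token to its k-th chosen expert
            for e, expert in enumerate(self.experts):
                mask = expert_idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

# Usage: y = MoELayer()(torch.randn(16, 512))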
One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used as aides to human scientists, e.g. for brainstorming...
External link: http://arxiv.org/abs/2408.06292
Authors: Goldie, Alexander David, Lu, Chris, Jackson, Matthew Thomas, Whiteson, Shimon, Foerster, Jakob Nicolaus
While reinforcement learning (RL) holds great potential for decision making in the real world, it suffers from a number of unique difficulties which often need specific consideration. In particular: it is highly non-stationary; suffers from high degrees...
External link: http://arxiv.org/abs/2407.07082
Authors: Gallici, Matteo, Fellows, Mattie, Ellis, Benjamin, Pou, Bartomeu, Masmitja, Ivan, Foerster, Jakob Nicolaus, Martin, Mario
Q-learning played a foundational role in the field of reinforcement learning (RL). However, TD algorithms with off-policy data, such as Q-learning, or nonlinear function approximation like deep neural networks require several additional tricks to stabilize...
External link: http://arxiv.org/abs/2407.04811
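
For context, the TD update this abstract alludes to is the classical Q-learning rule. Below is a minimal tabular sketch, assuming a gymnasium-style discrete environment; it is not the stabilized deep TD method proposed in the paper, and the hyperparameters are illustrative assumptions.

# Minimal sketch of the classical tabular Q-learning TD update referenced in the
# abstract; NOT the deep TD method of arXiv:2407.04811. Environment interface is
# assumed to follow gymnasium's reset/step API for discrete spaces.
import numpy as np

def q_learning(env, n_episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    for _ in range(n_episodes):
        s, _ = env.reset()
        done = False
        while not done:
            # epsilon-greedy behaviour policy (off-policy w.r.t. the greedy target)
            a = env.action_space.sample() if np.random.rand() < epsilon else int(Q[s].argmax())
            s_next, r, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
            # TD target bootstraps from the greedy value of the next state
            target = r + (0.0 if terminated else gamma * Q[s_next].max())
            Q[s, a] += alpha * (target - Q[s, a])  # move Q towards the TD target
            s = s_next
    return Q

# Usage (assumes gymnasium is installed):
#   import gymnasium as gym
#   Q = q_learning(gym.make("FrozenLake-v1"))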
Authors: Willi, Timon, Obando-Ceron, Johan, Foerster, Jakob, Dziugaite, Karolina, Castro, Pablo Samuel
Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep...
External link: http://arxiv.org/abs/2406.18420