Showing 1 - 10 of 10 for search: '"Jenner, Erik"'
Author:
Jenner, Erik, Kapur, Shreyas, Georgiev, Vasil, Allen, Cameron, Emmons, Scott, Russell, Stuart
Do neural networks learn to implement algorithms such as look-ahead or search "in the wild"? Or do they rely purely on collections of simple heuristics? We present evidence of learned look-ahead in the policy network of Leela Chess Zero…
External link:
http://arxiv.org/abs/2406.00877
Large language models generate code one token at a time. Their autoregressive generation process lacks the feedback of observing the program's output. Training LLMs to suggest edits directly can be challenging due to the scarcity of rich edit data…
External link:
http://arxiv.org/abs/2405.20519
Author:
Anwar, Usman, Saparov, Abulhair, Rando, Javier, Paleka, Daniel, Turpin, Miles, Hase, Peter, Lubana, Ekdeep Singh, Jenner, Erik, Casper, Stephen, Sourbut, Oliver, Edelman, Benjamin L., Zhang, Zhaowei, Günther, Mario, Korinek, Anton, Hernandez-Orallo, Jose, Hammond, Lewis, Bigelow, Eric, Pan, Alexander, Langosco, Lauro, Korbak, Tomasz, Zhang, Heidi, Zhong, Ruiqi, hÉigeartaigh, Seán Ó, Recchia, Gabriel, Corsi, Giulio, Chan, Alan, Anderljung, Markus, Edwards, Lilian, Petrov, Aleksandar, de Witt, Christian Schroeder, Motwani, Sumeet Ramesh, Bengio, Yoshua, Chen, Danqi, Torr, Philip H. S., Albanie, Samuel, Maharaj, Tegan, Foerster, Jakob, Tramer, Florian, He, He, Kasirzadeh, Atoosa, Choi, Yejin, Krueger, David
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods…
External link:
http://arxiv.org/abs/2404.09932
Past analyses of reinforcement learning from human feedback (RLHF) assume that the human evaluators fully observe the environment. What happens when human feedback is based only on partial observations? We formally define two failure cases: deceptive…
External link:
http://arxiv.org/abs/2402.17747
Author:
Skalse, Joar, Farnik, Lucy, Motwani, Sumeet Ramesh, Jenner, Erik, Gleave, Adam, Abate, Alessandro
In order to solve a task using reinforcement learning, it is necessary to first formalise the goal of that task as a reward function. However, for many real-world tasks, it is very difficult to manually specify a reward function that never incentivises…
External link:
http://arxiv.org/abs/2309.15257
Author:
Gleave, Adam, Taufeeque, Mohammad, Rocamonde, Juan, Jenner, Erik, Wang, Steven H., Toyer, Sam, Ernestus, Maximilian, Belrose, Nora, Emmons, Scott, Russell, Stuart
imitation provides open-source implementations of imitation and reward learning algorithms in PyTorch. We include three inverse reinforcement learning (IRL) algorithms, three imitation learning algorithms and a preference comparison algorithm…
External link:
http://arxiv.org/abs/2211.11972
In reinforcement learning, different reward functions can be equivalent in terms of the optimal policies they induce. A particularly well-known and important example is potential shaping, a class of functions that can be added to any reward function…
External link:
http://arxiv.org/abs/2208.09570
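For reference, the potential shaping mentioned in this abstract is the classical potential-based transformation of Ng, Harada & Russell (1999): for a discount factor γ and any potential function Φ over states,

```latex
R'(s, a, s') = R(s, a, s') + \gamma \, \Phi(s') - \Phi(s), \qquad \Phi : S \to \mathbb{R}
```

Adding this term to any reward function leaves the set of optimal policies unchanged, which is why the two reward functions are considered equivalent.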
Author:
Jenner, Erik, Gleave, Adam
In many real-world applications, the reward function is too complex to be manually specified. In such cases, reward functions must instead be learned from human feedback. Since the learned reward may fail to represent user preferences, it is important…
External link:
http://arxiv.org/abs/2203.13553
The minimum graph cut and minimum $s$-$t$-cut problems are important primitives in the modeling of combinatorial problems in computer science, including in computer vision and machine learning. Some of the most efficient algorithms for finding global…
External link:
http://arxiv.org/abs/2110.02750
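As a concrete illustration of the minimum $s$-$t$-cut primitive this abstract refers to (not the algorithm studied in the paper itself), here is a minimal Edmonds-Karp max-flow sketch; by max-flow/min-cut duality, the returned value equals the minimum $s$-$t$-cut. The `graph` at the bottom is a made-up example network.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly augment along shortest residual paths.

    `capacity` is a dict-of-dicts {u: {v: cap}}. The returned max-flow
    value equals the minimum s-t cut by the max-flow/min-cut theorem.
    """
    # Build residual capacities, including zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left
        # Reconstruct the path and find its bottleneck capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

# Example network: the cut {s} vs {a, b, t} has capacity 3 + 2 = 5,
# which is also the maximum flow.
graph = {"s": {"a": 3, "b": 2}, "a": {"t": 2, "b": 5}, "b": {"t": 3}, "t": {}}
```

This is a plain polynomial-time classical baseline; the paper's contribution concerns a different algorithmic setting.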
Author:
Jenner, Erik, Weiler, Maurice
Recent work in equivariant deep learning bears strong similarities to physics. Fields over a base space are fundamental entities in both subjects, as are equivariant maps between these fields. In deep learning, however, these maps are usually defined…
External link:
http://arxiv.org/abs/2106.10163