Search results

Report

Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps

Authors: Ellis, Benjamin; Jackson, Matthew T.; Lupu, Andrei; Goldie, Alexander D.; Fellows, Mattie; Whiteson, Shimon; Foerster, Jakob

In reinforcement learning (RL), it is common to apply techniques used broadly in machine learning such as neural network function approximators and momentum-based optimizers. However, such tools were largely developed for supervised learning rather t

External link: http://arxiv.org/abs/2412.17113

Report

Simplifying Deep Temporal Difference Learning

Authors: Gallici, Matteo; Fellows, Mattie; Ellis, Benjamin; Pou, Bartomeu; Masmitja, Ivan; Foerster, Jakob Nicolaus; Martin, Mario

Q-learning played a foundational role in the field of reinforcement learning (RL). However, TD algorithms with off-policy data, such as Q-learning, or nonlinear function approximation like deep neural networks require several additional tricks to stabil

External link: http://arxiv.org/abs/2407.04811

Report

A Bayesian Solution To The Imitation Gap

Authors: Vuorio, Risto; Fellows, Mattie; Lu, Cong; Grislain, Clémence; Whiteson, Shimon

In many real-world settings, an agent must learn to act in environments where no reward signal can be specified, but a set of expert demonstrations is available. Imitation learning (IL) is a popular framework for learning policies from such demonstra

External link: http://arxiv.org/abs/2407.00495

Report

How Thick is the Air-Water Interface? -- A Direct Experimental Measurement of the Decay Length of the Interfacial Structural Anisotropy

Authors: Fellows, Alexander P.; Duque, Álvaro Díaz; Balos, Vasileios; Lehmann, Louis; Netz, Roland R.; Wolf, Martin; Thämer, Martin

The air-water interface is a highly prevalent phase boundary with a far-reaching impact on natural and industrial processes. Water molecules behave differently at the interface compared to the bulk, exhibiting anisotropic orientational distributions,

External link: http://arxiv.org/abs/2404.12247

Report

Refining Minimax Regret for Unsupervised Environment Design

Authors: Beukman, Michael; Coward, Samuel; Matthews, Michael; Fellows, Mattie; Jiang, Minqi; Dennis, Michael; Foerster, Jakob

In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax

External link: http://arxiv.org/abs/2402.12284

Book

Food Processing Technology : Principles and Practice. [electronic resource]

Author: Fellows, P. J.

External link: KNAV e-book collection (Registered users: full text online for 5 minutes, further access on request.)

Report

Spiralling molecular structures and chiral selectivity in model membranes

Authors: Fellows, Alexander P.; John, Ben; Wolf, Martin; Thämer, Martin

Since the lipid raft model was developed at the end of the last century, it has become clear that the specific molecular arrangements of phospholipid assemblies within a membrane have profound implications in a vast range of physiological functions. Stud

External link: http://arxiv.org/abs/2312.05074
