Search results

Report

Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps

Authors: Ellis, Benjamin, Jackson, Matthew T., Lupu, Andrei, Goldie, Alexander D., Fellows, Mattie, Whiteson, Shimon, Foerster, Jakob

In reinforcement learning (RL), it is common to apply techniques used broadly in machine learning such as neural network function approximators and momentum-based optimizers. However, such tools were largely developed for supervised learning rather t

External link: http://arxiv.org/abs/2412.17113

Report

Simplifying Deep Temporal Difference Learning

Authors: Gallici, Matteo, Fellows, Mattie, Ellis, Benjamin, Pou, Bartomeu, Masmitja, Ivan, Foerster, Jakob Nicolaus, Martin, Mario

Q-learning played a foundational role in the field of reinforcement learning (RL). However, TD algorithms with off-policy data, such as Q-learning, or nonlinear function approximation like deep neural networks require several additional tricks to stabil

External link: http://arxiv.org/abs/2407.04811

Report

A Bayesian Solution To The Imitation Gap

Authors: Vuorio, Risto, Fellows, Mattie, Lu, Cong, Grislain, Clémence, Whiteson, Shimon

In many real-world settings, an agent must learn to act in environments where no reward signal can be specified, but a set of expert demonstrations is available. Imitation learning (IL) is a popular framework for learning policies from such demonstra

External link: http://arxiv.org/abs/2407.00495

Report

How Thick is the Air-Water Interface? -- A Direct Experimental Measurement of the Decay Length of the Interfacial Structural Anisotropy

Authors: Fellows, Alexander P., Duque, Álvaro Díaz, Balos, Vasileios, Lehmann, Louis, Netz, Roland R., Wolf, Martin, Thämer, Martin

The air-water interface is a highly prevalent phase boundary with a far-reaching impact on natural and industrial processes. Water molecules behave differently at the interface compared to the bulk, exhibiting anisotropic orientational distributions,

External link: http://arxiv.org/abs/2404.12247

Report

Refining Minimax Regret for Unsupervised Environment Design

Authors: Beukman, Michael, Coward, Samuel, Matthews, Michael, Fellows, Mattie, Jiang, Minqi, Dennis, Michael, Foerster, Jakob

In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax

External link: http://arxiv.org/abs/2402.12284

Report

Spiralling molecular structures and chiral selectivity in model membranes

Authors: Fellows, Alexander P., John, Ben, Wolf, Martin, Thämer, Martin

Since the lipid raft model was developed at the end of the last century, it has become clear that the specific molecular arrangements of phospholipid assemblies within a membrane have profound implications in a vast range of physiological functions. Stud

External link: http://arxiv.org/abs/2312.05074

Academic article

Initial feasibility cohort of temporally modulated pulsed proton re-irradiation (TMPPR) for recurrent high-grade intracranial malignancies

Authors: Alonso La Rosa, Zachary Fellows, Andrew J. Wroe, Len Coutinho, Eduardo Pons, Nicole C. McAllister, Ranjini Tolakanahalli, Tugce Kutuk, Matthew D. Hall, Robert H. Press, Michael W. McDermott, Yazmin Odia, Manmeet S. Ahluwalia, Minesh P. Mehta, Alonso N. Gutierrez, Rupesh Kotecha

Published in: Scientific Reports, Vol 14, Iss 1, Pp 1-12 (2024)

Abstract: Recurrent high-grade intracranial malignancies have a grim prognosis and uniform management guidelines are lacking. Re-irradiation is underused due to concerns about irreversible side effects. Pulsed-reduced dose rate radiotherapy (PRDR) aim

External link: https://doaj.org/article/38b5a8839f2442f29fe87abccf11e364

Report

Open Problems in (Hyper)Graph Decomposition

Large networks are useful in a wide range of applications. Sometimes problem instances are composed of billions of entities. Decomposing and analyzing these structures helps us gain new insights about our surroundings. Even if the final application c

External link: http://arxiv.org/abs/2310.11812

Report

The MAPS Adaptive Secondary Mirror: First Light, Laboratory Work, and Achievements

The MMT Adaptive Optics exoPlanet Characterization System (MAPS) is a comprehensive update to the first generation MMT adaptive optics system (MMTAO), designed to produce a facility class suite of instruments whose purpose is to image nearby exoplane

External link: http://arxiv.org/abs/2309.14466

Report

Bayesian Exploration Networks

Authors: Fellows, Mattie, Kaplowitz, Brandon, de Witt, Christian Schroeder, Whiteson, Shimon

Bayesian reinforcement learning (RL) offers a principled and elegant approach for sequential decision making under uncertainty. Most notably, Bayesian agents do not face an exploration/exploitation dilemma, a major pathology of frequentist methods. H

External link: http://arxiv.org/abs/2308.13049
