Showing 1 - 10 of 3,718 results for search: '"P. Fellows"'
Author:
Ellis, Benjamin, Jackson, Matthew T., Lupu, Andrei, Goldie, Alexander D., Fellows, Mattie, Whiteson, Shimon, Foerster, Jakob
In reinforcement learning (RL), it is common to apply techniques used broadly in machine learning such as neural network function approximators and momentum-based optimizers. However, such tools were largely developed for supervised learning rather t
External link:
http://arxiv.org/abs/2412.17113
Author:
Gallici, Matteo, Fellows, Mattie, Ellis, Benjamin, Pou, Bartomeu, Masmitja, Ivan, Foerster, Jakob Nicolaus, Martin, Mario
Q-learning played a foundational role in the field of reinforcement learning (RL). However, TD algorithms with off-policy data, such as Q-learning, or nonlinear function approximation like deep neural networks require several additional tricks to stabil
External link:
http://arxiv.org/abs/2407.04811
In many real-world settings, an agent must learn to act in environments where no reward signal can be specified, but a set of expert demonstrations is available. Imitation learning (IL) is a popular framework for learning policies from such demonstra
External link:
http://arxiv.org/abs/2407.00495
Author:
Fellows, Alexander P., Duque, Álvaro Díaz, Balos, Vasileios, Lehmann, Louis, Netz, Roland R., Wolf, Martin, Thämer, Martin
The air-water interface is a highly prevalent phase boundary with a far-reaching impact on natural and industrial processes. Water molecules behave differently at the interface compared to the bulk, exhibiting anisotropic orientational distributions,
External link:
http://arxiv.org/abs/2404.12247
Author:
Beukman, Michael, Coward, Samuel, Matthews, Michael, Fellows, Mattie, Jiang, Minqi, Dennis, Michael, Foerster, Jakob
In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax
External link:
http://arxiv.org/abs/2402.12284
Since the lipid raft model was developed at the end of the last century, it has become clear that the specific molecular arrangements of phospholipid assemblies within a membrane have profound implications for a vast range of physiological functions. Stud
External link:
http://arxiv.org/abs/2312.05074
Author:
Alonso La Rosa, Zachary Fellows, Andrew J. Wroe, Len Coutinho, Eduardo Pons, Nicole C. McAllister, Ranjini Tolakanahalli, Tugce Kutuk, Matthew D. Hall, Robert H. Press, Michael W. McDermott, Yazmin Odia, Manmeet S. Ahluwalia, Minesh P. Mehta, Alonso N. Gutierrez, Rupesh Kotecha
Published in:
Scientific Reports, Vol 14, Iss 1, Pp 1-12 (2024)
Recurrent high-grade intracranial malignancies have a grim prognosis and uniform management guidelines are lacking. Re-irradiation is underused due to concerns about irreversible side effects. Pulsed-reduced dose rate radiotherapy (PRDR) aim
External link:
https://doaj.org/article/38b5a8839f2442f29fe87abccf11e364
Author:
Ajwani, Deepak, Bisseling, Rob H., Casel, Katrin, Çatalyürek, Ümit V., Chevalier, Cédric, Chudigiewitsch, Florian, Faraj, Marcelo Fonseca, Fellows, Michael, Gottesbüren, Lars, Heuer, Tobias, Karypis, George, Kaya, Kamer, Lacki, Jakub, Langguth, Johannes, Li, Xiaoye Sherry, Mayer, Ruben, Meintrup, Johannes, Mizutani, Yosuke, Pellegrini, François, Petrini, Fabrizio, Rosamond, Frances, Safro, Ilya, Schlag, Sebastian, Schulz, Christian, Sharma, Roohani, Strash, Darren, Sullivan, Blair D., Uçar, Bora, Yzelman, Albert-Jan
Large networks are useful in a wide range of applications. Sometimes problem instances are composed of billions of entities. Decomposing and analyzing these structures helps us gain new insights about our surroundings. Even if the final application c
External link:
http://arxiv.org/abs/2310.11812
Author:
Johnson, Jess A., Vaz, Amali, Montoya, Manny, Anugu, Narsireddy, Ard, Cameron, Carlson, Jared, Chapman, Kimberly, Durney, Olivier, Fellows, Chuck, Gardner, Andrew, Guyon, Olivier, Jannuzi, Buell, Jones, Ron, Kulesa, Craig, Long, Joseph, McEwen, Eden, Males, Jared, Mailhot, Emily, Sanchez, Jorge, Sivanandam, Suresh, Swanson, Robin, Taylor, Jacob, Vargas, Dan, West, Grant, Patience, Jennifer, Morzinski, Katie
The MMT Adaptive Optics exoPlanet Characterization System (MAPS) is a comprehensive update to the first generation MMT adaptive optics system (MMTAO), designed to produce a facility class suite of instruments whose purpose is to image nearby exoplane
External link:
http://arxiv.org/abs/2309.14466
Bayesian reinforcement learning (RL) offers a principled and elegant approach for sequential decision making under uncertainty. Most notably, Bayesian agents do not face an exploration/exploitation dilemma, a major pathology of frequentist methods. H
External link:
http://arxiv.org/abs/2308.13049