Showing 1 - 10 of 1,285 results for search: '"Castro, Pablo A."'
The quality of numerical computations can be measured through their forward error, for which finding good error bounds is challenging in general. For several algorithms and using stochastic rounding (SR), probabilistic analysis has been shown to be a
External link:
http://arxiv.org/abs/2411.13601
We introduce the Continuous Arcade Learning Environment (CALE), an extension of the well-known Arcade Learning Environment (ALE) [Bellemare et al., 2013]. The CALE uses the same underlying emulator of the Atari 2600 gaming system (Stella), but adds s
External link:
http://arxiv.org/abs/2410.23810
The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reaso
External link:
http://arxiv.org/abs/2410.01930
To describe the spatial trajectory of an overdamped Brownian particle, inertial effects can be neglected. Yet, at the energetic level of stochastic thermodynamics, changes in kinetic energy must be considered to accurately predict the heat exchanged
External link:
http://arxiv.org/abs/2408.16104
Authors:
Pignatelli, Eduardo, Liesen, Jarek, Lange, Robert Tjarko, Lu, Chris, Castro, Pablo Samuel, Toni, Laura
As Deep Reinforcement Learning (Deep RL) research moves towards solving large-scale worlds, efficient environment simulations become crucial for rapid experimentation. However, most existing environments struggle to scale to high throughput, setting
External link:
http://arxiv.org/abs/2407.19396
Authors:
Delval, Aurélien, Coppens, François, Petit, Eric, Iakymchuk, Roman, Castro, Pablo de Oliveira
Floating-point accuracy is an important concern when developing numerical simulations or other compute-intensive codes. Tracking the introduction of numerical regressions is often delayed until it provokes an unexpected bug for the end-user. In this pape
External link:
http://arxiv.org/abs/2407.08262
Authors:
Willi, Timon, Obando-Ceron, Johan, Foerster, Jakob, Dziugaite, Karolina, Castro, Pablo Samuel
Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep
External link:
http://arxiv.org/abs/2406.18420
Deep reinforcement learning (deep RL) has achieved tremendous success in various domains through a combination of algorithmic design and careful selection of hyper-parameters. Algorithmic improvements are often the result of iterative enhancements bu
External link:
http://arxiv.org/abs/2406.17523
Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to implement it in programs can be challenging. In this article, we consider Nekbone, a mini-application for the CFD s
External link:
http://arxiv.org/abs/2405.11065
Authors:
Farebrother, Jesse, Orbay, Jordi, Vuong, Quan, Taïga, Adrien Ali, Chebotar, Yevgen, Xiao, Ted, Irpan, Alex, Levine, Sergey, Castro, Pablo Samuel, Faust, Aleksandra, Kumar, Aviral, Agarwal, Rishabh
Value functions are a central component of deep reinforcement learning (RL). These functions, parameterized by neural networks, are trained using a mean squared error regression objective to match bootstrapped target values. However, scaling value-ba
External link:
http://arxiv.org/abs/2403.03950