Search results - "Castro, Pablo A."

Report

Error Analysis of Sum-Product Algorithms under Stochastic Rounding

Authors: Castro, Pablo de Oliveira, Arar, El-Mehdi El, Petit, Eric, Sohier, Devan

The quality of numerical computations can be measured through their forward error, for which finding good error bounds is challenging in general. For several algorithms and using stochastic rounding (SR), probabilistic analysis has been shown to be a

External link: http://arxiv.org/abs/2411.13601

Report

CALE: Continuous Arcade Learning Environment

Authors: Farebrother, Jesse, Castro, Pablo Samuel

We introduce the Continuous Arcade Learning Environment (CALE), an extension of the well-known Arcade Learning Environment (ALE) [Bellemare et al., 2013]. The CALE uses the same underlying emulator of the Atari 2600 gaming system (Stella), but adds s

External link: http://arxiv.org/abs/2410.23810

Report

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

Authors: Sokar, Ghada, Obando-Ceron, Johan, Courville, Aaron, Larochelle, Hugo, Castro, Pablo Samuel

The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reaso

External link: http://arxiv.org/abs/2410.01930

Report

Effects of kinetic energy on heat fluctuations of passive and active overdamped driven particles

Authors: Paraguassú, Pedro V., Aquino, Rui, de Castro, Pablo

To describe the spatial trajectory of an overdamped Brownian particle, inertial effects can be neglected. Yet, at the energetic level of stochastic thermodynamics, changes in kinetic energy must be considered to accurately predict the heat exchanged

External link: http://arxiv.org/abs/2408.16104

Report

NAVIX: Scaling MiniGrid Environments with JAX

Authors: Pignatelli, Eduardo, Liesen, Jarek, Lange, Robert Tjarko, Lu, Chris, Castro, Pablo Samuel, Toni, Laura

As Deep Reinforcement Learning (Deep RL) research moves towards solving large-scale worlds, efficient environment simulations become crucial for rapid experimentation. However, most existing environments struggle to scale to high throughput, setting

External link: http://arxiv.org/abs/2407.19396

Report

Verificarlo CI: continuous integration for numerical optimization and debugging

Authors: Delval, Aurélien, Coppens, François, Petit, Eric, Iakymchuk, Roman, Castro, Pablo de Oliveira

Floating-point accuracy is an important concern when developing numerical simulations or other compute-intensive codes. Tracking the introduction of numerical regression is often delayed until it provokes unexpected bug for the end-user. In this pape

External link: http://arxiv.org/abs/2407.08262

Report

Mixture of Experts in a Mixture of RL settings

Authors: Willi, Timon, Obando-Ceron, Johan, Foerster, Jakob, Dziugaite, Karolina, Castro, Pablo Samuel

Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep

External link: http://arxiv.org/abs/2406.18420

Report

On the consistency of hyper-parameter selection in value-based deep reinforcement learning

Authors: Obando-Ceron, Johan, Araújo, João G. M., Courville, Aaron, Castro, Pablo Samuel

Deep reinforcement learning (deep RL) has achieved tremendous success on various domains through a combination of algorithmic design and careful selection of hyper-parameters. Algorithmic improvements are often the result of iterative enhancements bu

External link: http://arxiv.org/abs/2406.17523

Report

Enabling mixed-precision with the help of tools: A Nekbone case study

Authors: Chen, Yanxiang, Castro, Pablo de Oliveira, Bientinesi, Paolo, Iakymchuk, Roman

Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to implement it in programs can be challenging. In this article, we consider Nekbone, a mini-application for the CFD s

External link: http://arxiv.org/abs/2405.11065

Report

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Authors: Farebrother, Jesse, Orbay, Jordi, Vuong, Quan, Taïga, Adrien Ali, Chebotar, Yevgen, Xiao, Ted, Irpan, Alex, Levine, Sergey, Castro, Pablo Samuel, Faust, Aleksandra, Kumar, Aviral, Agarwal, Rishabh

Value functions are a central component of deep reinforcement learning (RL). These functions, parameterized by neural networks, are trained using a mean squared error regression objective to match bootstrapped target values. However, scaling value-ba

External link: http://arxiv.org/abs/2403.03950
