Showing 1 - 10 of 5,144
for search: '"Cerón, P."'
The loss of plasticity in learning agents, analogous to the solidification of neural pathways in biological brains, significantly impedes learning and adaptation in reinforcement learning due to its non-stationary nature. To address this fundamental…
External link:
http://arxiv.org/abs/2410.07994
The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reaso…
External link:
http://arxiv.org/abs/2410.01930
Author:
Vélez-Ceron, Ignasi, Coelho, Rodrigo C. V., Guillamat, Pau, da Gama, Margarida Telo, Sagués, Francesc, Ignés-Mullol, Jordi
Microfluidics involves the manipulation of flows at the microscale, typically requiring external power sources to generate pressure gradients. Alternatively, harnessing flows from active fluids, which are usually chaotic, has been proposed as a parad…
External link:
http://arxiv.org/abs/2407.09960
Author:
Ardaševa, Aleksandra, Vélez-Cerón, Ignasi, Pedersen, Martin Cramer, Ignés-Mullol, Jordi, Sagués, Francesc, Doostmohammadi, Amin
We present a novel two-stage transition of the ordered active nematic state of a system of bundled microtubules into a biphasic active fluid. Specifically, we show that upon light-induced solidification of the underlying medium, microtubule-kinesin m…
External link:
http://arxiv.org/abs/2407.03723
Author:
McAleese, Nat, Pokorny, Rai Michael, Uribe, Juan Felipe Ceron, Nitishinskaya, Evgenia, Trebacz, Maja, Leike, Jan
Reinforcement learning from human feedback (RLHF) is fundamentally limited by the capacity of humans to correctly evaluate model output. To improve human evaluation ability and overcome that limitation, this work trains "critic" models that help human…
External link:
http://arxiv.org/abs/2407.00215
Author:
Willi, Timon, Obando-Ceron, Johan, Foerster, Jakob, Dziugaite, Karolina, Castro, Pablo Samuel
Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep…
External link:
http://arxiv.org/abs/2406.18420
Deep reinforcement learning (deep RL) has achieved tremendous success on various domains through a combination of algorithmic design and careful selection of hyper-parameters. Algorithmic improvements are often the result of iterative enhancements bu…
External link:
http://arxiv.org/abs/2406.17523
The propagation of light beams in photovoltaic pyroelectric photorefractive crystals is modelled by a specific generalization of the nonlinear Schrödinger equation (GNLSE). We use the variational approximation (VA) to predict the propagation of sol…
External link:
http://arxiv.org/abs/2405.10445
Due to the widespread use of large language models (LLMs), we need to understand whether they embed a specific "worldview" and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanin…
External link:
http://arxiv.org/abs/2402.17649
Recent work has shown that deep reinforcement learning agents have difficulty in effectively using their network parameters. We leverage prior insights into the advantages of sparse training techniques and demonstrate that gradual magnitude pruning e…
External link:
http://arxiv.org/abs/2402.12479