Showing 1 - 10 of 5,144
for search: '"Cerón, P."'
The loss of plasticity in learning agents, analogous to the solidification of neural pathways in biological brains, significantly impedes learning and adaptation in reinforcement learning due to its non-stationary nature. To address this fundamental…
External link:
http://arxiv.org/abs/2410.07994
The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reaso…
External link:
http://arxiv.org/abs/2410.01930
Author:
Vélez-Ceron, Ignasi, Coelho, Rodrigo C. V., Guillamat, Pau, da Gama, Margarida Telo, Sagués, Francesc, Ignés-Mullol, Jordi
Microfluidics involves the manipulation of flows at the microscale, typically requiring external power sources to generate pressure gradients. Alternatively, harnessing flows from active fluids, which are usually chaotic, has been proposed as a parad…
External link:
http://arxiv.org/abs/2407.09960
Author:
Ardaševa, Aleksandra, Vélez-Cerón, Ignasi, Pedersen, Martin Cramer, Ignés-Mullol, Jordi, Sagués, Francesc, Doostmohammadi, Amin
We present a novel two-stage transition of the ordered active nematic state of a system of bundled microtubules into a biphasic active fluid. Specifically, we show that upon light-induced solidification of the underlying medium, microtubule-kinesin m…
External link:
http://arxiv.org/abs/2407.03723
Author:
McAleese, Nat, Pokorny, Rai Michael, Uribe, Juan Felipe Ceron, Nitishinskaya, Evgenia, Trebacz, Maja, Leike, Jan
Reinforcement learning from human feedback (RLHF) is fundamentally limited by the capacity of humans to correctly evaluate model output. To improve human evaluation ability and overcome that limitation, this work trains "critic" models that help human…
External link:
http://arxiv.org/abs/2407.00215
Author:
Willi, Timon, Obando-Ceron, Johan, Foerster, Jakob, Dziugaite, Karolina, Castro, Pablo Samuel
Mixtures of Experts (MoEs) have gained prominence in (self-)supervised learning due to their enhanced inference efficiency, adaptability to distributed training, and modularity. Previous research has illustrated that MoEs can significantly boost Deep…
External link:
http://arxiv.org/abs/2406.18420
Deep reinforcement learning (deep RL) has achieved tremendous success on various domains through a combination of algorithmic design and careful selection of hyper-parameters. Algorithmic improvements are often the result of iterative enhancements bu…
External link:
http://arxiv.org/abs/2406.17523
The propagation of light beams in photovoltaic pyroelectric photorefractive crystals is modelled by a specific generalization of the nonlinear Schrödinger equation (GNLSE). We use the variational approximation (VA) to predict the propagation of sol…
External link:
http://arxiv.org/abs/2405.10445
Due to the widespread use of large language models (LLMs), we need to understand whether they embed a specific "worldview" and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanin…
External link:
http://arxiv.org/abs/2402.17649
Recent work has shown that deep reinforcement learning agents have difficulty in effectively using their network parameters. We leverage prior insights into the advantages of sparse training techniques and demonstrate that gradual magnitude pruning e…
External link:
http://arxiv.org/abs/2402.12479