Showing 1 - 10 of 5,554
for search: '"ALBERTO, MARIA"'
Author:
Genoni, Matteo, Dekker, Hans, Covino, Stefano, Cirami, Roberto, Scalera, Marcello Agostino, Bissel, Lawrence, Seifert, Walter, Calcines, Ariadna, Avila, Gerardo, Stuermer, Julian, Ritz, Christopher, Lunney, David, Miller, Chris, Watson, Stephen, Waring, Chris, Castilho, Bruno Vaz, De Arruda, Marcio, Verducci, Orlando, Coretti, Igor, Oggioni, Luca, Pariani, Giorgio, Redaelli, Edoardo Alberto Maria, D'Ambrogio, Matteo, Calderone, Giorgio, Porru, Matteo, Stilz, Ingo, Smiljanic, Rodolfo, Cupani, Guido, Franchini, Mariagrazia, Scaudo, Andrea, Geers, Vincent, De Caprio, Vincenzo, D'Auria, Domenico, Sibalic, Mina, Opitom, Cyrielle, Cescutti, Gabriele, D'Odorico, Valentina, Sanchez Janssen, Ruben, Quirrenbach, Andreas, Barbuy, Beatriz, Cristiani, Stefano, Di Marcantonio, Paolo
Published in:
Proceedings Volume 13096, Ground-based and Airborne Instrumentation for Astronomy X; 130967T (2024)
In the era of Extremely Large Telescopes, the current generation of 8-10m facilities are likely to remain competitive at ground-UV wavelengths for the foreseeable future. The Cassegrain U-Band Efficient Spectrograph (CUBES) has been designed to provide…
External link:
http://arxiv.org/abs/2412.03460
Policy search methods are crucial in reinforcement learning, offering a framework to address continuous state-action and partially observable problems. However, the complexity of exploring vast policy spaces can lead to significant inefficiencies…
External link:
http://arxiv.org/abs/2411.09900
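For context on the snippet above: policy search optimizes the parameters of a parametric policy directly from sampled returns. Below is a minimal REINFORCE-style sketch on a toy stateless problem; the softmax policy, reward means, and step size are illustrative assumptions, not details of the linked paper.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])  # assumed toy reward means, one per action
theta = np.zeros(3)                     # policy parameters (softmax logits)
alpha = 0.1                             # assumed learning rate

def softmax(z):
    z = z - z.max()                     # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

for t in range(2000):
    probs = softmax(theta)
    a = rng.choice(3, p=probs)          # sample an action from the current policy
    r = true_means[a] + 0.1 * rng.standard_normal()   # noisy scalar reward
    grad_log = -probs                   # gradient of log pi(a|theta) for a softmax policy
    grad_log[a] += 1.0
    theta += alpha * r * grad_log       # REINFORCE: ascend E[r * grad log pi]

print("learned action probabilities:", softmax(theta).round(3))
```

The update is the score-function gradient estimate $r \nabla_\theta \log \pi_\theta(a)$; the exploration inefficiency the abstract mentions shows up here as the many samples needed before probability mass concentrates on the best action.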
This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e. those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a. $arm$). We study a particular case of the rested bandits in…
External link:
http://arxiv.org/abs/2411.14446
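The bandit feedback model described above (the learner observes only the reward of the arm it pulls) is easiest to see in code. Here is a minimal UCB1 loop under assumed Bernoulli arms; the arm means and horizon are examples, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([0.3, 0.5, 0.7])       # assumed Bernoulli arm means
K, T = len(means), 5000
counts = np.zeros(K)                    # pulls per arm
sums = np.zeros(K)                      # cumulative reward per arm

for t in range(1, T + 1):
    if t <= K:
        a = t - 1                       # pull each arm once to initialize
    else:
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        a = int(np.argmax(ucb))         # optimism in the face of uncertainty
    r = float(rng.random() < means[a])  # bandit feedback: only arm a's reward is seen
    counts[a] += 1
    sums[a] += r

print("pulls per arm:", counts.astype(int))   # most pulls concentrate on the best arm
```

Note that classical UCB1 assumes each arm's mean is fixed; the rested setting studied in the paper relaxes exactly that assumption.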
Achieving the no-regret property for Reinforcement Learning (RL) problems in continuous state and action-space environments is one of the major open problems in the field. Existing solutions either work under very specific assumptions or achieve bounds…
External link:
http://arxiv.org/abs/2410.24071
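For readers unfamiliar with the term: the no-regret property asks that cumulative regret grow sublinearly in the number of learning episodes $K$. A standard episodic formulation (a generic textbook definition, not necessarily the exact notion used in the paper) is:

$$ \mathrm{Regret}(K) \;=\; \sum_{k=1}^{K} \Big( V^{\pi^*}(s_1) - V^{\pi_k}(s_1) \Big), \qquad \text{no-regret} \;\Longleftrightarrow\; \mathrm{Regret}(K) = o(K), $$

where $\pi^*$ is an optimal policy, $\pi_k$ is the policy played in episode $k$, and $V^{\pi}(s_1)$ is the value of $\pi$ from the initial state.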
Policy evaluation via Monte Carlo (MC) simulation is at the core of many MC Reinforcement Learning (RL) algorithms (e.g., policy gradient methods). In this context, the designer of the learning system specifies an interaction budget that the agent uses…
External link:
http://arxiv.org/abs/2410.13463
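To make the interaction-budget picture above concrete, the sketch below spends a fixed number of sampled transitions across full episodes and averages the discounted returns; the toy chain MDP, budget, horizon, and discount are assumptions for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, budget, horizon = 0.99, 10_000, 50   # assumed discount, interaction budget, episode length

def step(s, a):
    """Toy 1-D random-walk MDP (an assumption for illustration)."""
    s2 = max(0, min(9, s + (1 if a == 1 else -1) + int(rng.integers(-1, 2))))
    return s2, float(s2 == 9)               # reward 1 only in the goal state

def policy(s):
    return 1                                # evaluate the "always move right" policy

returns, used = [], 0
while used + horizon <= budget:             # stop once a full episode no longer fits the budget
    s, G, discount = 0, 0.0, 1.0
    for _ in range(horizon):
        s, r = step(s, policy(s))
        G += discount * r
        discount *= gamma
        used += 1
    returns.append(G)

print(f"episodes run: {len(returns)}, MC estimate of the policy's value: {np.mean(returns):.3f}")
```

One natural design question (which may or may not be the paper's focus) is how to split such a budget across trajectories; here it is split naively into fixed-length episodes.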
Dealing with Partially Observable Markov Decision Processes is notably a challenging task. We face an average-reward infinite-horizon POMDP setting with an unknown transition model, where we assume the knowledge of the observation model. Under this assumption…
External link:
http://arxiv.org/abs/2410.01331
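To make "knowledge of the observation model" concrete: with a known observation matrix, the belief over hidden states can be filtered with a standard Bayes update even while the transition model must be estimated. The matrices below are assumed toy values, not from the paper.

```python
import numpy as np

# Assumed toy POMDP with 2 hidden states and 2 observations.
T = np.array([[0.9, 0.1],   # T[s, s']: transition probabilities (unknown in the paper's
              [0.2, 0.8]])  # setting; a fixed candidate here for illustration)
O = np.array([[0.8, 0.2],   # O[s', o]: the known observation model P(o | s')
              [0.3, 0.7]])

def belief_update(b, o):
    """Bayes filter: predict with T, correct with the known O, renormalize."""
    b_pred = b @ T                 # prediction step
    b_new = b_pred * O[:, o]       # correction using the observed symbol o
    return b_new / b_new.sum()

b = np.array([0.5, 0.5])           # uniform initial belief
for o in [0, 0, 1, 1, 1]:          # an assumed observation sequence
    b = belief_update(b, o)
    print(b.round(3))
```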
Our goal is to extract useful knowledge from demonstrations of behavior in sequential decision-making problems. Although it is well-known that humans commonly engage in risk-sensitive behaviors in the presence of stochasticity, most Inverse Reinforcement Learning…
External link:
http://arxiv.org/abs/2409.17355
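"Risk-sensitive" above is in contrast to the standard risk-neutral objective of maximizing expected return. One common risk measure is CVaR, shown below purely as an illustration (the paper may model risk differently).

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(1.0, 2.0, size=100_000)   # assumed return distribution of some behavior

def cvar(x, alpha=0.1):
    """Conditional Value-at-Risk: mean of the worst alpha-fraction of outcomes."""
    cutoff = np.quantile(x, alpha)
    return x[x <= cutoff].mean()

print(f"risk-neutral objective (mean):       {returns.mean():.2f}")
print(f"risk-sensitive objective (CVaR 10%): {cvar(returns):.2f}")
```

Two behaviors with the same mean return can differ sharply in CVaR, which is why a risk-neutral model can misread a risk-averse demonstrator.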
Author:
Genalti, Gianmarco, Mussi, Marco, Gatti, Nicola, Restelli, Marcello, Castiglioni, Matteo, Metelli, Alberto Maria
Rested and Restless Bandits are two well-known bandit settings that are useful to model real-world sequential decision-making problems in which the expected reward of an arm evolves over time due to the actions we perform or due to the nature. In this…
External link:
http://arxiv.org/abs/2409.05980
$\textit{Restless Bandits}$ describe sequential decision-making problems in which the rewards evolve with time independently from the actions taken by the policy-maker. It has been shown that classical Bandit algorithms fail when the underlying environment…
External link:
http://arxiv.org/abs/2409.05181
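The rested/restless distinction in the two snippets above comes down to which index drives an arm's expected reward: the arm's own pull count (rested) versus global time (restless). A minimal illustration under assumed linear drifts:

```python
def rested_mean(base, pulls, drift=0.01):
    """Rested: the arm's mean moves only when that arm is pulled."""
    return base + drift * pulls

def restless_mean(base, t, drift=0.01):
    """Restless: the arm's mean moves with global time, regardless of actions."""
    return base + drift * t

pulls = {0: 0, 1: 0}               # per-arm pull counters
for t in range(5):
    a = t % 2                      # alternate arms just to exercise both cases
    print(f"t={t} arm={a} rested={rested_mean(0.5, pulls[a]):.2f} "
          f"restless={restless_mean(0.5, t):.2f}")
    pulls[a] += 1
```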
The increase of renewable energy generation towards the zero-emission target is making the problem of controlling power grids more and more challenging. The recent series of competitions Learning To Run a Power Network (L2RPN) has encouraged the use…
External link:
http://arxiv.org/abs/2409.04467