Showing 1 - 10 of 3,258 results for search: '"Restelli A."'
Policy search methods are crucial in reinforcement learning, offering a framework to address continuous state-action and partially observable problems. However, the complexity of exploring vast policy spaces can lead to significant inefficiencies. Re
External link:
http://arxiv.org/abs/2411.09900
Authors:
Liu, Puze, Günster, Jonas, Funk, Niklas, Gröger, Simon, Chen, Dong, Bou-Ammar, Haitham, Jankowski, Julius, Marić, Ante, Calinon, Sylvain, Orsula, Andrej, Olivares-Mendez, Miguel, Zhou, Hongyi, Lioutikov, Rudolf, Neumann, Gerhard, Likmeta, Amarildo, Zhalehmehrabi, Amirhossein, Bonenfant, Thomas, Restelli, Marcello, Tateo, Davide, Liu, Ziyuan, Peters, Jan
Machine learning methods have a groundbreaking impact in many application domains, but their application on real robotic platforms is still limited. Despite the many challenges associated with combining machine learning technology with robotics, robo
External link:
http://arxiv.org/abs/2411.05718
Achieving the no-regret property for Reinforcement Learning (RL) problems in continuous state and action-space environments is one of the major open problems in the field. Existing solutions either work under very specific assumptions or achieve boun
External link:
http://arxiv.org/abs/2410.24071
Policy evaluation via Monte Carlo (MC) simulation is at the core of many MC Reinforcement Learning (RL) algorithms (e.g., policy gradient methods). In this context, the designer of the learning system specifies an interaction budget that the agent us
External link:
http://arxiv.org/abs/2410.13463
Authors:
Monaco, Vito Alessandro, Riva, Antonio, Sabbioni, Luca, Bisi, Lorenzo, Vittori, Edoardo, Pinciroli, Marco, Trapletti, Michele, Restelli, Marcello
In recent years, the popularity of artificial intelligence has surged due to its widespread application in various fields. The financial sector has harnessed its advantages for multiple purposes, including the development of automated trading systems
External link:
http://arxiv.org/abs/2410.23294
Dealing with Partially Observable Markov Decision Processes is notably a challenging task. We face an average-reward infinite-horizon POMDP setting with an unknown transition model, where we assume the knowledge of the observation model. Under this a
External link:
http://arxiv.org/abs/2410.01331
Authors:
Tu, J., Restelli, A., Tsui, T. -C., Weber, K., Spielman, I. B., Rolston, S. L., Porto, J. V., Subhankar, S.
The Pound-Drever-Hall (PDH) technique is routinely used to stabilize the frequency of a laser to a reference cavity. The electronic sideband (ESB) locking scheme, a PDH variant, helps bridge the frequency difference between the quantized frequencies
External link:
http://arxiv.org/abs/2409.08764
Authors:
Genalti, Gianmarco, Mussi, Marco, Gatti, Nicola, Restelli, Marcello, Castiglioni, Matteo, Metelli, Alberto Maria
Rested and Restless Bandits are two well-known bandit settings that are useful to model real-world sequential decision-making problems in which the expected reward of an arm evolves over time due to the actions we perform or due to the nature. In thi
External link:
http://arxiv.org/abs/2409.05980
The increase of renewable energy generation towards the zero-emission target is making the problem of controlling power grids more and more challenging. The recent series of competitions Learning To Run a Power Network (L2RPN) have encouraged the use
External link:
http://arxiv.org/abs/2409.04467
Hierarchical Reinforcement Learning (HRL) approaches have shown successful results in solving a large variety of complex, structured, long-horizon problems. Nevertheless, a full theoretical understanding of this empirical evidence is currently missin
External link:
http://arxiv.org/abs/2406.15124