Showing 1 - 10 of 14 for search: '"Celaya Llover, Enric"'
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Technical report Collocation methods for optimal control commonly assume that the system dynamics is expressed as a first order ODE of the form dx/dt = f(x, u, t), where x is the state and u the control vector. However, in many cases, the dynamics in
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::e625beadb5c1ace8248f2b5be1441633
https://hdl.handle.net/2117/366522
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::b8d6e812458dda13efe5a6c7b0840589
http://hdl.handle.net/2117/353547
Published in:
Recercat. Dipósit de la Recerca de Catalunya
Universitat Jaume I
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::d2ef8fba972722f8e6a3b1ea43da5b08
https://hdl.handle.net/2117/28454
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Recercat. Dipósit de la Recerca de Catalunya
IRI Technical Report. In this work we explain how the stochastic approximation of the average of a random variable is carried out when the observations used in the updates consist of proportions of samples rather than complete samples.
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::9bbf15f9e132efa442ac2c1fc85bedce
http://hdl.handle.net/2117/14112
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Recercat. Dipósit de la Recerca de Catalunya
The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and in essence, too large for conventional RL algo
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::0c15f57c0aa5ea12abfe3b9d10fa5ff0
http://hdl.handle.net/2117/10368
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Recercat. Dipósit de la Recerca de Catalunya
Performing Q-Learning in continuous state-action spaces is still an unsolved problem for many complex applications. The Q function may be rather complex and cannot be expected to fit into a predefined parametric model. In addition, the function appro
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::e15af1576ead99aaf81182948b8296e4
https://hdl.handle.net/2117/6856
Author:
Agostini, Alejandro Gabriel; Celaya Llover, Enric (ORCID 0000-0001-8480-7706); Torras, Carme (ORCID 0000-0002-2933-398X); Wörgötter, Florentin
Published in:
Recercat. Dipósit de la Recerca de Catalunya
Universitat Jaume I
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
In this work we propose a learning system to learn on-line an action policy coded in rules using natural human instructions about cause-effect relations in currently observed situations. The instructions only on currently observed situations avoid co
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::b4f25f9b68556db9f49ee8eb3b167595
https://hdl.handle.net/2117/2692
Author:
Agostini, Alejandro Gabriel; Celaya Llover, Enric (ORCID 0000-0001-8480-7706); Torras, Carme (ORCID 0000-0002-2933-398X); Wörgötter, Florentin
Published in:
Digital.CSIC. Repositorio Institucional del CSIC
Recercat. Dipósit de la Recerca de Catalunya
Universitat Jaume I
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Presented at the International Conference on Cognitive Systems, held in Karlsruhe (Germany), 2-4 April 2008.
In this work we propose a decision-making system that efficiently learns behaviors in the form of rules using natural hu
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::09778f687bed18155258a09c3b2a9b3c
http://hdl.handle.net/10261/30350
Author:
Rodríguez Tsouroukdissian, Adolfo; Basañez Villaluenga, Luis (ORCID 0000-0002-5599-1636); Celaya Llover, Enric (ORCID 0000-0001-8480-7706)
Published in:
Recercat. Dipósit de la Recerca de Catalunya
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
This paper presents a relational positioning methodology for flexibly and intuitively specifying offline programmed robot tasks, as well as for assisting the execution of teleoperated tasks demanding precise movements. In relational positioning, the
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::27803097ef8b58f8a49d117688c9d9e1
https://hdl.handle.net/2117/1531
Published in:
Recercat. Dipósit de la Recerca de Catalunya
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
In this work we propose an approach for generalization in continuous domain Reinforcement Learning that, instead of using a single function approximator, tries many different function approximators in parallel, each one defined in a different region
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::1d64da0948781b37bb134d1c38cf7b26
http://hdl.handle.net/2117/14123