Showing 1 - 7 of 7 for search: '"Harm van Seijen"'
Published in:
Proceedings of the AAAI Conference on Artificial Intelligence. 32
In Reinforcement Learning, an intelligent agent has to make a sequence of decisions to accomplish a goal. If this sequence is long, then the agent has to plan over a long horizon. While learning the optimal policy and its value function is a well stu
Published in:
Computational Intelligence. 30:657-699
This article addresses reinforcement learning problems based on factored Markov decision processes (MDPs) in which the agent must choose among a set of candidate abstractions, each built up from a different combination of state components. We present a
Published in:
Journal of Machine Learning Research, 12, 2045-2094. Microtome Publishing
University of Groningen
Journal of Machine Learning Research, June, 12, 2045-2094
Journal of Machine Learning Research, 12(Jun), 2045-2094
This article presents and evaluates best-match learning, a new approach to reinforcement learning that trades off the sample efficiency of model-based methods with the space efficiency of model-free methods. Best-match learning works by approximating
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::7f1843b3e6684827e53786b0a9b42ac3
https://dare.uva.nl/personal/pure/en/publications/exploiting-bestmatch-equations-for-efficient-reinforcement-learning(dfece862-a70f-4d0a-91eb-a0a9377ca1f4).html
Published in:
SPIE Proceedings.
Band selection is essential in the design of multispectral sensor systems. This paper describes the TNO hyperspectral band selection tool HYBASE. It calculates the optimum band positions given the number of bands and the width of the spectral bands.
Author:
Harm van Seijen, Shimon Whiteson
Published in:
9th International Conference on Intelligent Systems Design and Applications-ISDA'09, November 30-December 2, 2009, Pisa, Italy, 665-672
ISDA
2009 9th International Conference on Intelligent Systems Design and Applications (ISDA 2009): Pisa, Italy, 30 November-2 December 2009, 665-672
This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based RL. By recording the agent's last-visit experience, the agent can delay
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1341e5d75e038dd48a92f65390bd938a
http://resolver.tudelft.nl/uuid:87daa5d4-0e9e-448e-af33-efa5d6c61997
Published in:
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009, 30 March-2 April 2009, Nashville, TN, USA, 177-184
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 177-184
University of Groningen
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning: ADPRL
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
ADPRL
This paper presents a theoretical and empirical analysis of Expected Sarsa, a variation on Sarsa, the classic on-policy temporal-difference method for model-free reinforcement learning. Expected Sarsa exploits knowledge about stochasticity in the beha
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::13a5d0c2059407fd54b09a93fb7338a7
http://resolver.tudelft.nl/uuid:0e868206-5237-42d0-8bc0-3deb94a679ef
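For readers unfamiliar with the method named in the abstract above, the following is a minimal sketch of the Expected Sarsa update as it is commonly stated in the reinforcement learning literature, not code taken from the paper. Where Sarsa bootstraps from the value of the single next action that was sampled, Expected Sarsa bootstraps from the expectation of the next-state action values under the policy. The epsilon-greedy policy, the tabular list-of-lists Q representation, and the names `epsilon_greedy_probs` and `expected_sarsa_update` are assumptions chosen for illustration.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state.

    Assumption: ties are broken in favour of the lowest action index.
    """
    n = len(q_values)
    greedy = max(range(n), key=lambda i: q_values[i])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    return probs


def expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon,
                          terminal=False):
    """Apply one tabular Expected Sarsa update to Q[s][a] in place.

    The target uses the policy-weighted average of Q[s_next] rather than
    the value of one sampled next action, as plain Sarsa would.
    """
    if terminal:
        target = r
    else:
        probs = epsilon_greedy_probs(Q[s_next], epsilon)
        expected_q = sum(p * q for p, q in zip(probs, Q[s_next]))
        target = r + gamma * expected_q
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


# Hypothetical two-state, two-action example.
Q = [[0.0, 0.0], [1.0, 3.0]]
new_value = expected_sarsa_update(Q, s=0, a=0, r=1.0, s_next=1,
                                  alpha=0.5, gamma=0.9, epsilon=0.5)
```

In this example the epsilon-greedy probabilities in the next state are 0.25 and 0.75, so the expected next-state value is 2.5 and the update moves `Q[0][0]` halfway toward the target 1.0 + 0.9 * 2.5.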
Author:
A. Puig-Molina, Narcis Mestres, Alexander Usoskin, Herbert C. Freyhardt, Juan C. González, Harm van Seijen, Francesc Alsina, H. Graafsma, Teresa Puig, Xavier Obradors
Published in:
Scopus-Elsevier
Narcis Mestres Andreu
Two novel complementary and non-destructive techniques for texture analysis of YBCO coated conductors are presented. Micro-Raman (μ-Raman) spectroscopy enables an easy analysis of the film homogeneity by determining the distribution of a- and c-orie
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bf7af05d80672bf7e041a24371c4fb2b
http://www.scopus.com/inward/record.url?eid=2-s2.0-85010767265&partnerID=MN8TOARS