On the Residual of Mirror Descent Search and Scalability via Dimensionality Reduction

Author:	Megumi Miyashita, Yuuki Murata, Toshiyuki Kondo, Shiro Yano
Year of publication:	2018
Subject:	Mathematical optimization; Sampling distribution; Computer science; Dimensionality reduction; Probability distribution; Reinforcement learning; Sampling (statistics); Variance (accounting); Residual; Expected utility hypothesis
Source:	2018 Seventh ICT International Student Project Conference (ICT-ISPC).
Description:	In a specific class of black-box optimization algorithms that search for the optimal probability distribution for some expected utility in reinforcement learning, higher-dimensional decision variables increase the cost and slow down learning. We clarified that the variance of the sampling probability distribution affects both the cost and the learning speed; in particular, there is a trade-off between the two. In this paper, we propose two tricks to improve both the learning speed and the cost. The first trick is to employ a small-variance sampling distribution to reduce the cost; as a side effect, it slows convergence. The second trick is to apply dimensionality reduction to the decision variable to improve the learning speed. We evaluated the effects of these tricks on a 2D arm-reaching task.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::63fd5f29a3cb17161ad9b8f4e80f33e1 https://doi.org/10.1109/ict-ispc.2018.8523865