Showing 1 - 6 of 6 for search: '"Arnold, Sébastien M. R."'
Author:
Arnold, Sébastien M. R., Sha, Fei
Constructing new and more challenging tasksets is a fruitful methodology to analyse and understand few-shot classification methods. Unfortunately, existing approaches to building those tasksets are somewhat unsatisfactory: they either assume train an
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::9ef01160bb73c27b39e2cdacc97dbc69
Model-Agnostic Meta-Learning (MAML) and its variants have achieved success in meta-learning tasks on many datasets and settings. On the other hand, we have just started to understand and analyze how they are able to adapt fast to new tasks. For examp
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::43942ea5971a2cb9ae1a01dc368720b2
http://arxiv.org/abs/1910.13603
We study the variance of the REINFORCE policy gradient estimator in environments with continuous state and action spaces, linear dynamics, quadratic cost, and Gaussian noise. These simple environments allow us to derive bounds on the estimator varian
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7e6fc1fe5d89618d1b3c0f0b548c53f6
http://arxiv.org/abs/1910.01249
Author:
Arnold, Sébastien M. R., Manzagol, Pierre-Antoine, Babanezhad, Reza, Mitliagkas, Ioannis, Le Roux, Nicolas
Most stochastic optimization methods use gradients once before discarding them. While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the on
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2fb98de608fb640321c4f931ac5f1df2
http://arxiv.org/abs/1906.03532
We present Shapechanger, a library for transfer reinforcement learning specifically designed for robotic tasks. We consider three types of knowledge transfer---from simulation to simulation, from simulation to real, and from real to real---and a wide
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8d7068e261a9f9c45af7aa3800c73097
http://arxiv.org/abs/1709.05070
We introduce a novel method to compute a rank $m$ approximation of the inverse of the Hessian matrix in the distributed regime. By leveraging the differences in gradients and parameters of multiple Workers, we are able to efficiently implement a dist
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::458c20801973ae10f349244409264d6c