Showing 1 - 10 of 78 for the search: '"Srivastava, Rupesh Kumar"'
This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference in the light of noisy data samples, then passed as input to a neural network…
External link: http://arxiv.org/abs/2308.07037
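
A minimal sketch of the kind of Bayesian update this abstract describes, using a standard conjugate-Gaussian step followed by a stand-in network readout; the names and the toy readout are illustrative assumptions, not the paper's algorithm:

    # Sketch (not the paper's algorithm): conjugate-Gaussian update of the
    # parameters of independent distributions given one noisy sample,
    # followed by a neural-network readout, as the abstract outlines.
    import numpy as np

    def bayesian_update(mu, rho, y, alpha):
        """Update N(mu, 1/rho) after observing y = x + noise with known
        noise precision alpha (standard conjugate-Gaussian update)."""
        rho_new = rho + alpha
        mu_new = (rho * mu + alpha * y) / rho_new
        return mu_new, rho_new

    rng = np.random.default_rng(0)
    x = rng.normal(size=4)                 # latent data point
    mu, rho = np.zeros(4), np.ones(4)      # independent priors per dimension
    alpha = 2.0                            # precision of the noisy observation
    y = x + rng.normal(size=4) / np.sqrt(alpha)
    mu, rho = bayesian_update(mu, rho, y, alpha)

    # In a BFN the updated parameters are fed to a neural network that
    # outputs a second, interdependent distribution; here a stand-in:
    W = rng.normal(size=(4, 4))            # hypothetical readout weights
    output_logits = np.tanh(mu) @ W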
We consider the problem of generative modeling based on smoothing an unknown density of interest in $\mathbb{R}^d$ using factorial kernels with $M$ independent Gaussian channels with equal noise levels introduced by Saremi and Srivastava (2022). First…
External link: http://arxiv.org/abs/2303.11669
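
For reference, assuming the $M$ Gaussian channels share the noise level $\sigma$ as stated, the smoothed density studied here is the convolution of the unknown $p$ on $\mathbb{R}^d$ with the factorial kernel, giving a density on $\mathbb{R}^{Md}$:

$$p_\sigma(y_1, \dots, y_M) = \int_{\mathbb{R}^d} p(x) \prod_{m=1}^{M} \mathcal{N}\!\left(y_m \mid x, \sigma^2 I_d\right) dx, \qquad y_m \in \mathbb{R}^d.$$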
Author:
Toklu, Nihat Engin, Atkinson, Timothy, Micka, Vojtěch, Liskowski, Paweł, Srivastava, Rupesh Kumar
Evolutionary computation is an important component within various fields such as artificial intelligence research, reinforcement learning, robotics, industrial automation and/or optimization, engineering design, etc. Considering the increasing computational…
External link: http://arxiv.org/abs/2302.12600
Author:
Štrupl, Miroslav, Faccio, Francesco, Ashley, Dylan R., Schmidhuber, Jürgen, Srivastava, Rupesh Kumar
Upside-Down Reinforcement Learning (UDRL) is an approach for solving RL problems that does not require value functions and uses only supervised learning, where the targets for given inputs in a dataset do not change over time. Ghosh et al. proved that…
External link: http://arxiv.org/abs/2205.06595
Lately, there has been a resurgence of interest in using supervised learning to solve reinforcement learning problems. Recent work in this area has largely focused on learning command-conditioned policies. We investigate the potential of one such method…
External link: http://arxiv.org/abs/2202.12742
Published in: International Conference on Learning Representations, 2022
We formally map the problem of sampling from an unknown distribution with a density in $\mathbb{R}^d$ to the problem of learning and sampling a smoother density in $\mathbb{R}^{Md}$ obtained by convolution with a fixed factorial kernel: the new density…
External link: http://arxiv.org/abs/2112.09822
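
A minimal sketch of generating the $M$-channel noisy measurements this mapping rests on; the noise level sigma and shapes are illustrative assumptions:

    # Sketch: turn one sample x in R^d into M noisy measurements in R^{Md},
    # one per channel, y_m = x + sigma * eps_m with independent Gaussian noise.
    import numpy as np

    def multimeasurement(x, M, sigma, rng):
        eps = rng.normal(size=(M, x.shape[-1]))
        return x[None, :] + sigma * eps    # shape (M, d)

    rng = np.random.default_rng(0)
    x = rng.normal(size=3)                 # a data point in R^3
    y = multimeasurement(x, M=4, sigma=0.5, rng=rng).reshape(-1)  # in R^{12}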
Author:
Štrupl, Miroslav, Faccio, Francesco, Ashley, Dylan R., Srivastava, Rupesh Kumar, Schmidhuber, Jürgen
Reward-Weighted Regression (RWR) belongs to a family of widely known iterative Reinforcement Learning algorithms based on the Expectation-Maximization framework. In this family, learning at each iteration consists of sampling a batch of trajectories…
External link: http://arxiv.org/abs/2107.09088
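
A minimal sketch of one such EM-style iteration for a linear-Gaussian policy, using the exponential reward weighting that RWR variants employ; beta and the toy data are illustrative assumptions, not the paper's exact setup:

    # Sketch: one Reward-Weighted Regression iteration.
    # E-step: weight sampled (state, action) pairs by transformed return.
    # M-step: refit the policy mean by weighted least squares.
    import numpy as np

    rng = np.random.default_rng(0)
    S = rng.normal(size=(256, 4))                      # sampled states
    K_true = rng.normal(size=(4, 2))
    A = S @ K_true + 0.3 * rng.normal(size=(256, 2))   # sampled actions
    R = -np.sum((A - S @ K_true) ** 2, axis=1)         # toy returns

    beta = 1.0
    w = np.exp(beta * (R - R.max()))                   # stabilized weights
    Wd = np.diag(w)
    # Weighted least squares: argmin_K sum_i w_i * ||A_i - S_i K||^2
    K_new = np.linalg.solve(S.T @ Wd @ S, S.T @ Wd @ A)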
Distribution-based search algorithms are an effective approach for evolutionary reinforcement learning of neural network controllers. In these algorithms, gradients of the total reward with respect to the policy parameters are estimated using a population…
External link: http://arxiv.org/abs/2008.02387
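
A minimal sketch of the population-based gradient estimate described here, followed by a ClipUp-style step (normalized gradient, momentum, and a cap on the update norm); all constants and the toy reward are illustrative assumptions:

    # Sketch: estimate the reward gradient from a population of perturbed
    # parameters, then apply a velocity-clipped update in the ClipUp spirit.
    import numpy as np

    def total_reward(theta):                 # stand-in for an episode rollout
        return -np.sum(theta ** 2)

    rng = np.random.default_rng(0)
    theta = rng.normal(size=8)
    velocity = np.zeros_like(theta)
    sigma, lr, momentum, max_speed = 0.1, 0.05, 0.9, 0.15

    for _ in range(100):
        eps = rng.normal(size=(64, 8))       # population of perturbations
        rewards = np.array([total_reward(theta + sigma * e) for e in eps])
        grad = (rewards[:, None] * eps).mean(axis=0) / sigma
        grad /= np.linalg.norm(grad) + 1e-8  # keep only the direction
        velocity = momentum * velocity + lr * grad
        speed = np.linalg.norm(velocity)
        if speed > max_speed:                # clip the update magnitude
            velocity *= max_speed / speed
        theta += velocity                    # ascend the estimated gradient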
Author:
Srivastava, Rupesh Kumar, Shyam, Pranav, Mutz, Filipe, Jaśkowski, Wojciech, Schmidhuber, Jürgen
We develop Upside-Down Reinforcement Learning (UDRL), a method for learning to act using only supervised learning techniques. Unlike traditional algorithms, UDRL does not use reward prediction or search for an optimal policy. Instead, it trains agents…
External link: http://arxiv.org/abs/1912.02877
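
A minimal sketch of the supervised training step UDRL relies on, which also illustrates the two UDRL entries above: a policy conditioned on a command (desired return, desired horizon) is fit to stored behavior with a plain classification loss, so the targets never change. The architecture and command encoding are illustrative assumptions, not the paper's exact setup:

    # Sketch: command-conditioned policy trained by supervised learning.
    # Input: state plus command (desired return, desired horizon);
    # target: the action actually taken in a recorded episode.
    import torch
    import torch.nn as nn

    state_dim, n_actions, n = 4, 3, 512
    policy = nn.Sequential(
        nn.Linear(state_dim + 2, 64), nn.ReLU(),
        nn.Linear(64, n_actions),
    )
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    # Toy replay data standing in for recorded episodes.
    states = torch.randn(n, state_dim)
    commands = torch.rand(n, 2)              # (desired return, horizon)
    actions = torch.randint(0, n_actions, (n,))

    for _ in range(200):
        logits = policy(torch.cat([states, commands], dim=1))
        loss = nn.functional.cross_entropy(logits, actions)  # fixed targets
        opt.zero_grad()
        loss.backward()
        opt.step()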
Author:
Kidziński, Łukasz, Ong, Carmichael, Mohanty, Sharada Prasanna, Hicks, Jennifer, Carroll, Sean F., Zhou, Bo, Zeng, Hongsheng, Wang, Fan, Lian, Rongzhong, Tian, Hao, Jaśkowski, Wojciech, Andersen, Garrett, Lykkebø, Odd Rune, Toklu, Nihat Engin, Shyam, Pranav, Srivastava, Rupesh Kumar, Kolesnikov, Sergey, Hrinchuk, Oleksii, Pechenko, Anton, Ljungström, Mattias, Wang, Zhen, Hu, Xu, Hu, Zehong, Qiu, Minghui, Huang, Jun, Shpilman, Aleksei, Sosin, Ivan, Svidchenko, Oleg, Malysheva, Aleksandra, Kudenko, Daniel, Rane, Lance, Bhatt, Aditya, Wang, Zhengfei, Qi, Penghui, Yu, Zeyang, Peng, Peng, Yuan, Quan, Li, Wenxin, Tian, Yunsheng, Yang, Ruihan, Ma, Pingchuan, Khadka, Shauharda, Majumdar, Somdeb, Dwiel, Zach, Liu, Yinyin, Tumer, Evren, Watson, Jeremy, Salathé, Marcel, Levine, Sergey, Delp, Scott
In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe…
External link: http://arxiv.org/abs/1902.02441