Showing 1 - 10 of 20 for search: '"Ni, Tianwei"'
Author:
Dong, Pingcheng, Tan, Yonghao, Zhang, Dong, Ni, Tianwei, Liu, Xuejiao, Liu, Yu, Luo, Peng, Liang, Luhong, Liu, Shih-Yang, Huang, Xijie, Zhu, Huaiyu, Pan, Yun, An, Fengwei, Cheng, Kwang-Ting
Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs. Previous state-of-the-art works optimize these operations by piece-wise linear approximation and…
External link:
http://arxiv.org/abs/2403.19591
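The snippet above mentions piece-wise linear approximation of non-linear functions for cheap hardware. As a minimal sketch of the general idea (not this paper's specific scheme; all names and the 64-segment setting are illustrative), a non-linearity such as tanh can be replaced by linear interpolation between tabulated breakpoints:

```python
import numpy as np

def pwl_approx(f, breakpoints):
    """Return a piece-wise linear approximation of f, defined by its
    values at the given breakpoints (linear interpolation in between).
    In hardware, the (x, y) pairs would live in a small look-up table."""
    ys = f(breakpoints)
    return lambda x: np.interp(x, breakpoints, ys)

# Approximate tanh on [-4, 4] with 64 segments (illustrative choice):
breaks = np.linspace(-4.0, 4.0, 65)
approx_tanh = pwl_approx(np.tanh, breaks)

# Worst-case error over a dense grid stays well below 1%:
x = np.linspace(-4.0, 4.0, 1001)
max_err = np.max(np.abs(approx_tanh(x) - np.tanh(x)))
```

The error shrinks quadratically with segment width, which is why modest look-up tables often suffice for smooth activations.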
A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy. However, this method often becomes impractical for…
External link:
http://arxiv.org/abs/2402.05290
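Why backpropagation through an unrolled world model "often becomes impractical" can be seen in a toy closed-form example (not the paper's method): with scalar dynamics the gradient of the unrolled return contains growing powers of the per-step Jacobian, so it explodes or vanishes with the horizon.

```python
# Toy dynamics s_{t+1} = a * s_t, so s_t = a^t * s_0 and the unrolled
# return sum_{t=1}^{H} s_t has gradient
#   d/da sum_{t=1}^{H} a^t s_0 = sum_{t=1}^{H} t * a^(t-1) * s_0,
# whose terms contain powers of the per-step Jacobian a.

def return_gradient(a, s0, horizon):
    """Closed-form gradient of the unrolled H-step return w.r.t. a."""
    return sum(t * a ** (t - 1) * s0 for t in range(1, horizon + 1))

# |a| > 1: gradient magnitude explodes as the horizon grows.
grad_short = return_gradient(1.1, 1.0, 10)
grad_long = return_gradient(1.1, 1.0, 50)
# |a| < 1: the tail vanishes and the gradient saturates (here at 4.0).
grad_vanish = return_gradient(0.5, 1.0, 200)
```

The same Jacobian-product structure appears in any differentiable simulator or learned model unrolled over long horizons.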
Author:
Ni, Tianwei, Eysenbach, Benjamin, Seyedsalehi, Erfan, Ma, Michel, Gehring, Clement, Mahajan, Aditya, Bacon, Pierre-Luc
Representations are at the core of all deep reinforcement learning (RL) methods for both Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Many representation learning methods and theoretical frameworks have…
External link:
http://arxiv.org/abs/2401.08898
Reinforcement learning (RL) algorithms face two distinct challenges: learning effective representations of past and present observations, and determining how actions influence future returns. Both challenges involve modeling long-term dependencies.
External link:
http://arxiv.org/abs/2307.03864
Deep reinforcement learning has shown promising results on an abundance of robotic tasks in simulation, including visual navigation and manipulation. Prior work generally aims to build embodied agents that solve their assigned tasks as quickly as possible…
External link:
http://arxiv.org/abs/2112.12612
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with memory-based architectures, such as recurrent neural networks, provides a general…
External link:
http://arxiv.org/abs/2110.05038
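The snippet above notes that adding memory-based architectures such as RNNs to model-free RL gives a general approach to POMDPs. A minimal sketch of the memory mechanism (hypothetical class and dimensions, untrained random weights; only the recurrence matters here): the hidden state folds in the whole observation history, standing in for the unobserved Markov state, so the same current observation can map to different actions under different histories.

```python
import numpy as np

rng = np.random.default_rng(0)

class RecurrentPolicy:
    """Minimal recurrent policy sketch: self.h summarizes the
    observation history seen since the last reset()."""

    def __init__(self, obs_dim, act_dim, hidden=8):
        self.W_in = rng.normal(size=(hidden, obs_dim)) * 0.1
        self.W_h = rng.normal(size=(hidden, hidden)) * 0.1
        self.W_out = rng.normal(size=(act_dim, hidden)) * 0.1
        self.h = np.zeros(hidden)

    def reset(self):
        self.h = np.zeros_like(self.h)

    def act(self, obs):
        # Elman-style update: history is folded into self.h.
        self.h = np.tanh(self.W_in @ obs + self.W_h @ self.h)
        return self.W_out @ self.h

# Same final observation, different histories -> different actions:
policy = RecurrentPolicy(obs_dim=2, act_dim=1)
policy.act(np.array([1.0, 0.0]))
a_history1 = policy.act(np.array([0.0, 1.0]))
policy.reset()
policy.act(np.array([0.0, 0.0]))
a_history2 = policy.act(np.array([0.0, 1.0]))
```

A memoryless policy would be forced to output the same action in both cases, which is exactly what fails in a POMDP.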
Author:
Ni, Tianwei, Li, Huao, Agrawal, Siddharth, Raja, Suhas, Jia, Fan, Gui, Yikang, Hughes, Dana, Lewis, Michael, Sycara, Katia
Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-…
External link:
http://arxiv.org/abs/2103.04439
Imitation learning is well-suited for robotic tasks where it is difficult to directly program the behavior or specify a cost for optimal control. In this work, we propose a method for learning the reward function (and the corresponding policy) to match…
External link:
http://arxiv.org/abs/2011.04709
Author:
Wang, Yufei, Ni, Tianwei
The exploration-exploitation dilemma has long been a crucial issue in reinforcement learning. In this paper, we propose a new approach to balance the two automatically. Our method is built upon the Soft Actor-Critic (SAC) algorithm, which uses…
External link:
http://arxiv.org/abs/2007.01932
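For context on the SAC mechanism this entry builds on: standard SAC (Haarnoja et al.) can balance exploration and exploitation by adjusting its entropy temperature alpha via gradient descent on J(alpha) = E[-alpha * (log pi(a|s) + H_target)]. The sketch below shows that standard update, not necessarily this paper's exact proposal; function names and the learning rate are illustrative.

```python
import numpy as np

def update_log_alpha(log_alpha, log_probs, target_entropy, lr=1e-3):
    """One gradient step on J(alpha) = E[-alpha * (log pi(a|s) + H_bar)].
    Parameterizing by log_alpha keeps alpha = exp(log_alpha) positive."""
    alpha = np.exp(log_alpha)
    # dJ/d(log_alpha) = -alpha * mean(log_probs + target_entropy)
    grad = -alpha * np.mean(log_probs + target_entropy)
    return log_alpha - lr * grad

# Entropy estimate -mean(log_probs) = 0.5 is above the target -1.0,
# so alpha shrinks (less exploration pressure):
la_down = update_log_alpha(0.0, np.array([-0.5]), target_entropy=-1.0)
# Entropy estimate -2.0 is below the target -1.0, so alpha grows:
la_up = update_log_alpha(0.0, np.array([2.0]), target_entropy=-1.0)
```

The feedback loop keeps policy entropy near the target without hand-tuning a fixed exploration coefficient.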
We focus on an important yet challenging problem: using a 2D deep network to deal with 3D segmentation for medical image analysis. Existing approaches either applied multi-view planar (2D) networks or directly used volumetric (3D) networks for this problem…
External link:
http://arxiv.org/abs/1812.00518