Showing 1 - 10 of 20 for search: '"Ni, Tianwei"'
Author:
Dong, Pingcheng, Tan, Yonghao, Zhang, Dong, Ni, Tianwei, Liu, Xuejiao, Liu, Yu, Luo, Peng, Liang, Luhong, Liu, Shih-Yang, Huang, Xijie, Zhu, Huaiyu, Pan, Yun, An, Fengwei, Cheng, Kwang-Ting
Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs. Previous state-of-the-art works optimize these operations by piece-wise linear approximation and…
External link:
http://arxiv.org/abs/2403.19591
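The snippet above mentions piece-wise linear approximation of non-linear functions for cheap hardware. As a minimal sketch of the general idea (not this paper's specific scheme; all names and the 64-segment setting are illustrative), a non-linearity such as tanh can be replaced by linear interpolation between tabulated breakpoints:

```python
import numpy as np

def pwl_approx(f, breakpoints):
    """Return a piece-wise linear approximation of f, defined by its
    values at the given breakpoints (linear interpolation in between).
    In hardware, the (x, y) pairs would live in a small look-up table."""
    ys = f(breakpoints)
    return lambda x: np.interp(x, breakpoints, ys)

# Approximate tanh on [-4, 4] with 64 segments (illustrative choice):
breaks = np.linspace(-4.0, 4.0, 65)
approx_tanh = pwl_approx(np.tanh, breaks)

# Worst-case error over a dense grid stays well below 1%:
x = np.linspace(-4.0, 4.0, 1001)
max_err = np.max(np.abs(approx_tanh(x) - np.tanh(x)))
```

The error shrinks quadratically with segment width, which is why modest look-up tables often suffice for smooth activations.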
A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy. However, this method often becomes impractical for…
External link:
http://arxiv.org/abs/2402.05290
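Why backpropagation through an unrolled world model "often becomes impractical" can be seen in a toy closed-form example (not the paper's method): with scalar dynamics the gradient of the unrolled return contains growing powers of the per-step Jacobian, so it explodes or vanishes with the horizon.

```python
# Toy dynamics s_{t+1} = a * s_t, so s_t = a^t * s_0 and the unrolled
# return sum_{t=1}^{H} s_t has gradient
#   d/da sum_{t=1}^{H} a^t s_0 = sum_{t=1}^{H} t * a^(t-1) * s_0,
# whose terms contain powers of the per-step Jacobian a.

def return_gradient(a, s0, horizon):
    """Closed-form gradient of the unrolled H-step return w.r.t. a."""
    return sum(t * a ** (t - 1) * s0 for t in range(1, horizon + 1))

# |a| > 1: gradient magnitude explodes as the horizon grows.
grad_short = return_gradient(1.1, 1.0, 10)
grad_long = return_gradient(1.1, 1.0, 50)
# |a| < 1: the tail vanishes and the gradient saturates (here at 4.0).
grad_vanish = return_gradient(0.5, 1.0, 200)
```

The same Jacobian-product structure appears in any differentiable simulator or learned model unrolled over long horizons.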
Author:
Ni, Tianwei, Eysenbach, Benjamin, Seyedsalehi, Erfan, Ma, Michel, Gehring, Clement, Mahajan, Aditya, Bacon, Pierre-Luc
Representations are at the core of all deep reinforcement learning (RL) methods for both Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Many representation learning methods and theoretical frameworks have…
External link:
http://arxiv.org/abs/2401.08898
Reinforcement learning (RL) algorithms face two distinct challenges: learning effective representations of past and present observations, and determining how actions influence future returns. Both challenges involve modeling long-term dependencies.
External link:
http://arxiv.org/abs/2307.03864
Deep reinforcement learning has shown promising results on an abundance of robotic tasks in simulation, including visual navigation and manipulation. Prior work generally aims to build embodied agents that solve their assigned tasks as quickly as possible…
External link:
http://arxiv.org/abs/2112.12612
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with memory-based architectures, such as recurrent neural networks, provides a general…
External link:
http://arxiv.org/abs/2110.05038
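The snippet above notes that adding memory-based architectures such as RNNs to model-free RL gives a general approach to POMDPs. A minimal sketch of the memory mechanism (hypothetical class and dimensions, untrained random weights; only the recurrence matters here): the hidden state folds in the whole observation history, standing in for the unobserved Markov state, so the same current observation can map to different actions under different histories.

```python
import numpy as np

rng = np.random.default_rng(0)

class RecurrentPolicy:
    """Minimal recurrent policy sketch: self.h summarizes the
    observation history seen since the last reset()."""

    def __init__(self, obs_dim, act_dim, hidden=8):
        self.W_in = rng.normal(size=(hidden, obs_dim)) * 0.1
        self.W_h = rng.normal(size=(hidden, hidden)) * 0.1
        self.W_out = rng.normal(size=(act_dim, hidden)) * 0.1
        self.h = np.zeros(hidden)

    def reset(self):
        self.h = np.zeros_like(self.h)

    def act(self, obs):
        # Elman-style update: history is folded into self.h.
        self.h = np.tanh(self.W_in @ obs + self.W_h @ self.h)
        return self.W_out @ self.h

# Same final observation, different histories -> different actions:
policy = RecurrentPolicy(obs_dim=2, act_dim=1)
policy.act(np.array([1.0, 0.0]))
a_history1 = policy.act(np.array([0.0, 1.0]))
policy.reset()
policy.act(np.array([0.0, 0.0]))
a_history2 = policy.act(np.array([0.0, 1.0]))
```

A memoryless policy would be forced to output the same action in both cases, which is exactly what fails in a POMDP.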
Author:
Ni, Tianwei, Li, Huao, Agrawal, Siddharth, Raja, Suhas, Jia, Fan, Gui, Yikang, Hughes, Dana, Lewis, Michael, Sycara, Katia
Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-…
External link:
http://arxiv.org/abs/2103.04439
Imitation learning is well-suited for robotic tasks where it is difficult to directly program the behavior or specify a cost for optimal control. In this work, we propose a method for learning the reward function (and the corresponding policy) to match…
External link:
http://arxiv.org/abs/2011.04709
Author:
Wang, Yufei, Ni, Tianwei
The exploration-exploitation dilemma has long been a crucial issue in reinforcement learning. In this paper, we propose a new approach to balance the two automatically. Our method is built upon the Soft Actor-Critic (SAC) algorithm, which uses…
External link:
http://arxiv.org/abs/2007.01932
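For context on the SAC mechanism this entry builds on: standard SAC (Haarnoja et al.) can balance exploration and exploitation by adjusting its entropy temperature alpha via gradient descent on J(alpha) = E[-alpha * (log pi(a|s) + H_target)]. The sketch below shows that standard update, not necessarily this paper's exact proposal; function names and the learning rate are illustrative.

```python
import numpy as np

def update_log_alpha(log_alpha, log_probs, target_entropy, lr=1e-3):
    """One gradient step on J(alpha) = E[-alpha * (log pi(a|s) + H_bar)].
    Parameterizing by log_alpha keeps alpha = exp(log_alpha) positive."""
    alpha = np.exp(log_alpha)
    # dJ/d(log_alpha) = -alpha * mean(log_probs + target_entropy)
    grad = -alpha * np.mean(log_probs + target_entropy)
    return log_alpha - lr * grad

# Entropy estimate -mean(log_probs) = 0.5 is above the target -1.0,
# so alpha shrinks (less exploration pressure):
la_down = update_log_alpha(0.0, np.array([-0.5]), target_entropy=-1.0)
# Entropy estimate -2.0 is below the target -1.0, so alpha grows:
la_up = update_log_alpha(0.0, np.array([2.0]), target_entropy=-1.0)
```

The feedback loop keeps policy entropy near the target without hand-tuning a fixed exploration coefficient.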
We focus on an important yet challenging problem: using a 2D deep network to deal with 3D segmentation for medical image analysis. Existing approaches either applied multi-view planar (2D) networks or directly used volumetric (3D) networks for this problem…
External link:
http://arxiv.org/abs/1812.00518