Showing 1 - 10 of 286 for search: '"Kobayashi, Taisuke"'
Author: Kobayashi, Taisuke
Experience replay (ER) used in (deep) reinforcement learning is considered to be applicable only to off-policy algorithms. However, there have been some cases in which ER has been applied to on-policy algorithms, suggesting that off-policyness might…
External link: http://arxiv.org/abs/2402.10374
Author: Kobayashi, Taisuke
Robot control using reinforcement learning has become popular, but its learning process generally terminates halfway through an episode for safety and time-saving reasons. This study addresses the problem of the most popular exception handling that…
External link: http://arxiv.org/abs/2308.12772
Author: Kobayashi, Taisuke
Soft actor-critic (SAC) in reinforcement learning is expected to be one of the next-generation robot control schemes. Its ability to maximize policy entropy would make a robotic controller robust to noise and perturbation, which is useful for real-world…
External link: http://arxiv.org/abs/2303.04356
Author: Kobayashi, Taisuke
Published in: Results in Control and Optimization, 2023
This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to efficiently facilitate search in reinforcement learning. While various bonuses have been designed to date, they are analogous to the depth-first…
External link: http://arxiv.org/abs/2212.10765
Author: Kobayashi, Taisuke; Fukumoto, Kota
Sampling-based model predictive control (MPC) can be applied to versatile robotic systems. However, real-time control with it is a major challenge due to its unstable updates and poor convergence. This paper tackles this challenge with a novel…
External link: http://arxiv.org/abs/2212.04298
Author: Kobayashi, Taisuke; Watanuki, Ryoma
Published in: Advanced Robotics, 2023
Extraction of a low-dimensional latent space from high-dimensional observation data is essential for constructing a real-time robot controller with a world model on the extracted latent space. However, there is no established method for tuning the dimension…
External link: http://arxiv.org/abs/2208.03936
Author: Kobayashi, Taisuke
Published in: Results in Control and Optimization, 2023
Deep reinforcement learning (DRL) is one of the promising approaches for introducing robots into complicated environments. The recent remarkable progress of DRL rests on regularization of the policy, which allows the policy to improve stably and efficiently…
External link: http://arxiv.org/abs/2203.09809
Author: Kobayashi, Taisuke
Demand for deep reinforcement learning (DRL) is gradually increasing to enable robots to perform complex tasks, while DRL is known to be unstable. As a technique to stabilize its learning, a target network that slowly and asymptotically matches a main…
External link: http://arxiv.org/abs/2202.12504
Author: Kobayashi, Taisuke
Published in: IROS 2022
This paper proposes a new regularization technique for reinforcement learning (RL) towards making policy and value functions smooth and stable. RL is known for the instability of the learning process and the sensitivity of the acquired policy to noise…
External link: http://arxiv.org/abs/2202.07152
Published in: Neurocomputing, 2023-08
With the increasing practicality of deep learning applications, practitioners are inevitably faced with datasets corrupted by noise from various sources, such as measurement errors, mislabeling, and estimated surrogate inputs/outputs, that can adversely…
External link: http://arxiv.org/abs/2201.06714