Showing 1 - 9 of 9 for search: '"Ying, Donghao"'
Bagging is a popular ensemble technique to improve the accuracy of machine learning models. It hinges on the well-established rationale that, by repeatedly retraining on resampled data, the aggregated model exhibits lower variance and hence higher st…
External link:
http://arxiv.org/abs/2405.14741
We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints. The objective and constraints are described by {\it general utiliti…
External link:
http://arxiv.org/abs/2305.17568
This work is dedicated to the algorithm design in a competitive framework, with the primary goal of learning a stable equilibrium. We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lac…
External link:
http://arxiv.org/abs/2305.17567
Stochastic time-varying optimization is an integral part of learning in which the shape of the function changes over time in a non-deterministic manner. This paper considers multiple models of stochastic time variation and analyzes the corresponding…
External link:
http://arxiv.org/abs/2302.11190
We study the scalable multi-agent reinforcement learning (MARL) with general utilities, defined as nonlinear functions of the team's long-term state-action occupancy measure. The objective is to find a localized policy that maximizes the average of t…
External link:
http://arxiv.org/abs/2302.07938
We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy Gradient Algori…
External link:
http://arxiv.org/abs/2205.10715
We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total utility. By l…
External link:
http://arxiv.org/abs/2110.08923
Academic article
We study convex Constrained Markov Decision Processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action occupancy measure. We propose a policy-based primal-dual algorithm that updates the primal variable via…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e98356b75dd69fdb913dca89092b3eb4
http://arxiv.org/abs/2205.10715