Showing 1 - 10 of 50 for search: '"Fei, Yingjie"'
Author:
Fei, Yingjie, Xu, Ruitu
We study risk-sensitive multi-agent reinforcement learning under general-sum Markov games, where agents optimize the entropic risk measure of rewards with possibly diverse risk preferences. We show that using the regret naively adapted from existing
External link:
http://arxiv.org/abs/2405.02724
Author:
Gupta, Ashim, Blum, Carter Wood, Choji, Temma, Fei, Yingjie, Shah, Shalin, Vempala, Alakananda, Srikumar, Vivek
Can language models transform inputs to protect text classifiers against adversarial attacks? In this work, we present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream text classi
External link:
http://arxiv.org/abs/2305.16444
Author:
Fei, Yingjie, Xu, Ruitu
In this paper, we study gap-dependent regret guarantees for risk-sensitive reinforcement learning based on the entropic risk measure. We propose a novel definition of sub-optimality gaps, which we call cascaded gaps, and we discuss their key componen
External link:
http://arxiv.org/abs/2203.03110
We study risk-sensitive reinforcement learning (RL) based on the entropic risk measure. Although existing works have established non-asymptotic regret guarantees for this problem, they leave open an exponential gap between the upper and lower bounds.
External link:
http://arxiv.org/abs/2111.03947
Author:
Yang, Shenglong, Wang, Lijun, Fei, Yingjie, Zhang, Shengmao, Yu, Linlin, Zhang, Heng, Wang, Fei, Wu, Yumei, Wu, Zuli, Wang, Wei, Shi, Jiayu, Jiang, Keji, Fan, Wei
Published in:
In Regional Studies in Marine Science February 2024 70
We consider reinforcement learning (RL) in episodic MDPs with adversarial full-information reward feedback and unknown fixed transition kernels. We propose two model-free policy optimization algorithms, POWER and POWER++, and establish guarantees for
External link:
http://arxiv.org/abs/2007.00148
We study risk-sensitive reinforcement learning in episodic Markov decision processes with unknown transition kernels, where the goal is to optimize the total reward under the risk measure of exponential utility. We propose two provably efficient mode
External link:
http://arxiv.org/abs/2006.13827
We develop a novel variant of the classical Frank-Wolfe algorithm, which we call spectral Frank-Wolfe, for convex optimization over a spectrahedron. The spectral Frank-Wolfe algorithm has a novel ingredient: it computes a few eigenvectors of the grad
External link:
http://arxiv.org/abs/2006.01719
Author:
Yang, Shenglong, Fei, Yingjie, Yu, Linlin, Tang, Fenghua, Zhang, Shengmao, Cheng, Tianfei, Fan, Wei, Yuan, Sanling, Zhang, Heng, Jiang, Keji
Published in:
In Ecological Indicators November 2023 155
Author:
Fei, Yingjie, Chen, Yudong
We study the statistical performance of semidefinite programming (SDP) relaxations for clustering under random graph models. Under the $\mathbb{Z}_{2}$ Synchronization model, Censored Block Model and Stochastic Block Model, we show that SDP achieves
External link:
http://arxiv.org/abs/1904.09635