Showing 1 - 10 of 798 for search: '"Yang, Wenhao"'
Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to i
External link:
http://arxiv.org/abs/2406.10956
This paper investigates projection-free algorithms for stochastic constrained multi-level optimization. In this context, the objective function is a nested composition of several smooth functions, and the decision set is closed and convex. Existing p
External link:
http://arxiv.org/abs/2406.03787
Sign stochastic gradient descent (signSGD) is a communication-efficient method that transmits only the sign of stochastic gradients for parameter updating. Existing literature has demonstrated that signSGD can achieve a convergence rate of $\mathcal{
External link:
http://arxiv.org/abs/2406.00489
To address the uncertainty in function types, recent progress in online convex optimization (OCO) has spurred the development of universal algorithms that simultaneously attain minimax rates for multiple types of convex functions. However, for a $T$-
External link:
http://arxiv.org/abs/2405.19705
A package's source code repository records the development history of the package, providing indispensable information for the use and risk monitoring of the package. However, a package release often misses its source code repository due to the separ
External link:
http://arxiv.org/abs/2404.16565
In this paper, we study distributional reinforcement learning from the perspective of statistical efficiency. We investigate distributional policy evaluation, aiming to estimate the complete distribution of the random return (denoted $\eta^\pi$) atta
External link:
http://arxiv.org/abs/2309.17262
Author:
Hua, Bobo, Yang, Wenhao
Mosconi proved Liouville theorems for ancient solutions of subexponential growth to the heat equation on a manifold with Ricci curvature bounded below. We extend these results to graphs with bounded geometry: for a graph with bounded geometry, any no
External link:
http://arxiv.org/abs/2309.17250
Author:
Kitamura, Toshinori, Kozuno, Tadashi, Tang, Yunhao, Vieillard, Nino, Valko, Michal, Yang, Wenhao, Mei, Jincheng, Ménard, Pierre, Azar, Mohammad Gheshlaghi, Munos, Rémi, Pietquin, Olivier, Geist, Matthieu, Szepesvári, Csaba, Kumagai, Wataru, Matsuo, Yutaka
Mirror descent value iteration (MDVI), an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL), has served as the basis for recent high-performing practical RL algorithms. However, despite the use of function appro
External link:
http://arxiv.org/abs/2305.13185
Author:
Wang, Yibo, Yang, Wenhao, Jiang, Wei, Lu, Shiyin, Wang, Bing, Tang, Haihong, Wan, Yuanyu, Zhang, Lijun
Projection-free online learning has drawn increasing interest due to its efficiency in solving high-dimensional problems with complicated constraints. However, most existing projection-free online methods focus on minimizing the static regret, which
External link:
http://arxiv.org/abs/2305.11726
We propose a novel generalization of constrained Markov decision processes (CMDPs) that we call the \emph{semi-infinitely constrained Markov decision process} (SICMDP). Particularly, we consider a continuum of constraints instead of a finite number o
External link:
http://arxiv.org/abs/2305.00254