Showing 1 - 10 of 26 for search: '"Shangtong, Zhang"'
Published in:
IJCAI
We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperfo
Author:
Shangtong Zhang, Hengshuai Yao
Published in:
AAAI
In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning. In ACE, we use actor ensemble (i.e., multiple actors) to search the global maxima of the critic. Besides t
Author:
Hengshuai Yao, Shangtong Zhang
Published in:
AAAI
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only the mean. QUO
We present a mean-variance policy iteration (MVPI) framework for risk-averse control in a discounted infinite horizon MDP optimizing the variance of a per-step reward random variable. MVPI enjoys great flexibility in that any policy evaluation method
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b5f66a4b80f37d185af1408b7a2dc359
http://arxiv.org/abs/2004.10888
Author:
Shangtong Zhang, Remi Tachet des Combes, Romain Laroche
Published in:
Journal of Machine Learning Research. 2022, Vol. 23, p1-91. 91p.
Author:
Shangtong Zhang, Shimon Whiteson
Published in:
Journal of Machine Learning Research. 2022, Vol. 23, p1-59. 59p.
Author:
Jianyi Wang, Mai Xu, Andrzej Wojcicki, Shangtong Zhang, Thomas Lukasiewicz, Zhenghua Xu, Yuhang Song
Published in:
AAAI
Intrinsic rewards were introduced to simulate how human intelligence works; they are usually evaluated by intrinsically-motivated play, i.e., playing games without extrinsic rewards but evaluated with extrinsic rewards. However, none of the existing
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8c21d98d71667799013327c58433d9d8
http://arxiv.org/abs/1905.04640
Author:
Marcus Edel, Sumedh Ghaisas, Yannis Mentekidis, Ryan R. Curtin, Shangtong Zhang, Mikhail Lozhnikov
Published in:
Journal of Open Source Software. 3:726
Author:
Donald G. Ahearn, Shangtong Zhang, R. Doyle Stulting, Brian L. Schwam, Robert B. Simmons, Michael A. Ward, George E. Pierce, Sidney A. Crow
Published in:
Cornea. 28:447-450
Purpose: To investigate the relative abilities of different haplotypes of the Fusarium solani (FSSC)-Fusarium oxysporum (FOSC) complexes to attach to and invade hydrogel contact lenses. Methods: Silicone hydrogel and traditional hydroxyethylmethacrylat
Author:
Donald G. Ahearn, Brian L. Schwam, Robert B. Simmons, R.D. Stulting, Sidney A. Crow, George E. Pierce, Shangtong Zhang
Published in:
Cornea. 26:1249-1254
Purpose: To examine in vitro conditions for attachment and penetration of silicone hydrogel (SH) lenses by clinical isolates of the Fusarium oxysporum-F. solani complexes and the relative susceptibilities of the fusaria in the lens matrices to multipu