Search results - "Shangtong, Zhang"

Deep Residual Reinforcement Learning (Extended Abstract)

Authors: Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson

Published in: IJCAI

We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperfo

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::140f27f3c5cdd55cf865886998a4967f
https://doi.org/10.24963/ijcai.2021/668

ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search

Authors: Shangtong Zhang, Hengshuai Yao

Published in: AAAI

In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning. In ACE, we use actor ensemble (i.e., multiple actors) to search the global maxima of the critic. Besides t

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::2fb85977d1a924672f60c7d940cb25ac
https://doi.org/10.1609/aaai.v33i01.33015789

QUOTA: The Quantile Option Architecture for Reinforcement Learning

Authors: Hengshuai Yao, Shangtong Zhang

Published in: AAAI

In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only the mean. QUO

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::2e77e70e374671ed1de7bac67e9dcd45
https://doi.org/10.1609/aaai.v33i01.33015797

Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning

Authors: Shangtong Zhang, Bo Liu, Shimon Whiteson

We present a mean-variance policy iteration (MVPI) framework for risk-averse control in a discounted infinite horizon MDP optimizing the variance of a per-step reward random variable. MVPI enjoys great flexibility in that any policy evaluation method

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b5f66a4b80f37d185af1408b7a2dc359
http://arxiv.org/abs/2004.10888

Academic article

Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch.

Authors: Shangtong Zhang (SHANGTONG@VIRGINIA.EDU), Remi Tachet des Combes (REMI.TACHET@MICROSOFT.COM), Romain Laroche (ROMAIN.LAROCHE@MICROSOFT.COM)

Published in: Journal of Machine Learning Research. 2022, Vol. 23, pp. 1-91 (91 pages).

Academic article

Truncated Emphatic Temporal Difference Methods for Prediction and Control.

Authors: Shangtong Zhang (SHANGTONG.ZHANG@CS.OX.AC.UK), Shimon Whiteson (SHIMON.WHITESON@CS.OX.AC.UK)

Published in: Journal of Machine Learning Research. 2022, Vol. 23, pp. 1-59 (59 pages).

Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards

Authors: Jianyi Wang, Mai Xu, Andrzej Wojcicki, Shangtong Zhang, Thomas Lukasiewicz, Zhenghua Xu, Yuhang Song

Published in: AAAI

Intrinsic rewards were introduced to simulate how human intelligence works; they are usually evaluated by intrinsically-motivated play, i.e., playing games without extrinsic rewards but evaluated with extrinsic rewards. However, none of the existing

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8c21d98d71667799013327c58433d9d8
http://arxiv.org/abs/1905.04640

Relative In Vitro Rates of Attachment and Penetration of Hydrogel Soft Contact Lenses by Haplotypes of Fusarium

Authors: Donald G. Ahearn, Shangtong Zhang, R. Doyle Stulting, Brian L. Schwam, Robert B. Simmons, Michael A. Ward, George E. Pierce, Sidney A. Crow

Published in: Cornea. 28:447-450

Purpose: To investigate the relative abilities of different haplotypes of the Fusarium solani (FSSC)-Fusarium oxysporum (FOSC) complexes to attach to and invade hydrogel contact lenses. Methods: Silicone hydrogel and traditional hydroxyethylmethacrylat

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::89bad963fea15a9d65d106d143c64cde
https://doi.org/10.1097/ico.0b013e31818d33fb

Differences Among Strains of the Fusarium oxysporum-F. solani Complexes in Their Penetration of Hydrogel Contact Lenses and Subsequent Susceptibility to Multipurpose Contact Lens Disinfection Solutions

Authors: Donald G. Ahearn, Brian L. Schwam, Robert B. Simmons, R.D. Stulting, Sidney A. Crow, George E. Pierce, Shangtong Zhang

Published in: Cornea. 26:1249-1254

Purpose: To examine in vitro conditions for attachment and penetration of silicone hydrogel (SH) lenses by clinical isolates of the Fusarium oxysporum-F. solani complexes and the relative susceptibilities of the fusaria in the lens matrices to multipu

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9a01f859f1f9f939a138f3303af72d80
https://doi.org/10.1097/ico.0b013e318148bd9a
