Showing 1 - 6 of 6 for search: '"Chandak, Yash"'
Author:
Lee, Jonathan N., Xie, Annie, Pacchiano, Aldo, Chandak, Yash, Finn, Chelsea, Nachum, Ofir, Brunskill, Emma
Large transformer models trained on diverse datasets have shown a remarkable ability to learn in-context, achieving high few-shot performance on tasks they were not explicitly trained to solve. In this paper, we study the in-context learning capabili
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b4400eba61e9bbcdad1844df9586524
http://arxiv.org/abs/2306.14892
Author:
Kostas, James E., Jordan, Scott M., Chandak, Yash, Theocharous, Georgios, Gupta, Dhawal, White, Martha, da Silva, Bruno Castro, Thomas, Philip S.
Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011] provide a powerful and flexible framework for deriving principled learning rules for arbitrary stochastic neural networks. The coagent framework offers an alternative to backpr
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c1052ad7e823707ca3dcad2409f0b8f0
http://arxiv.org/abs/2305.09838
Author:
Chandak, Yash, Thakoor, Shantanu, Guo, Zhaohan Daniel, Tang, Yunhao, Munos, Rémi, Dabney, Will, Borsa, Diana L
Representation learning and exploration are among the key challenges for any deep reinforcement learning agent. In this work, we provide a singular value decomposition based method that can be used to obtain representations that preserve the underlyi
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8e049dc56c75b278c82fa44306497334
Author:
Chandak, Yash, Shankar, Shiv, Bastian, Nathaniel D., da Silva, Bruno Castro, Brunskill, Emma, Thomas, Philip S.
Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary. This limits the application of such methods because real-world problems are often subject to changes due to exte
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::52856c68ea94985379bd5db39b76260d
Author:
Tang, Yunhao, Guo, Zhaohan Daniel, Richemond, Pierre Harvey, Pires, Bernardo Ávila, Chandak, Yash, Munos, Rémi, Rowland, Mark, Azar, Mohammad Gheshlaghi, Lan, Charline Le, Lyle, Clare, György, András, Thakoor, Shantanu, Dabney, Will, Piot, Bilal, Calandriello, Daniele, Valko, Michal
We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite its recent empirical succe
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3c9c957441083131b4a6c939b4480458
Many real-world sequential decision-making problems involve critical systems with financial risks and human-life risks. While several works in the past have proposed methods that are safe for deployment, they assume that the underlying problem is sta
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::cbbe2b153e084a687d46c40beb982700