Showing 1 - 6 of 6 for search: '"Tarassov, Eugene"'
Author:
Richemond, Pierre Harvey, Tang, Yunhao, Guo, Daniel, Calandriello, Daniele, Azar, Mohammad Gheshlaghi, Rafailov, Rafael, Pires, Bernardo Avila, Tarassov, Eugene, Spangher, Lucas, Ellsworth, Will, Severyn, Aliaksei, Mallinson, Jonathan, Shani, Lior, Shamir, Gil, Joshi, Rishabh, Liu, Tianqi, Munos, Remi, Piot, Bilal
The dominant framework for alignment of large language models (LLM), whether through reinforcement learning from human feedback or direct preference optimisation, is to learn from preference data. This involves building datasets where each element is
External link:
http://arxiv.org/abs/2405.19107
Author:
Tang, Yunhao, Guo, Daniel Zhaohan, Zheng, Zeyu, Calandriello, Daniele, Cao, Yuan, Tarassov, Eugene, Munos, Rémi, Pires, Bernardo Ávila, Valko, Michal, Cheng, Yong, Dabney, Will
Reinforcement learning from human feedback (RLHF) is the canonical framework for large language model alignment. However, the rising popularity of offline alignment algorithms challenges the need for on-policy sampling in RLHF. Within the context of rewar
External link:
http://arxiv.org/abs/2405.08448
Author:
Gemp, Ian, Anthony, Thomas, Bachrach, Yoram, Bhoopchand, Avishkar, Bullard, Kalesha, Connor, Jerome, Dasagi, Vibhavari, De Vylder, Bart, Duenez-Guzman, Edgar, Elie, Romuald, Everett, Richard, Hennes, Daniel, Hughes, Edward, Khan, Mina, Lanctot, Marc, Larson, Kate, Lever, Guy, Liu, Siqi, Marris, Luke, McKee, Kevin R., Muller, Paul, Perolat, Julien, Strub, Florian, Tacchetti, Andrea, Tarassov, Eugene, Wang, Zhe, Tuyls, Karl
The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning ranging from computing approximations to fundamental concepts in game theory to simulating social dilemmas in rich spatial environments and training 3-d hu
External link:
http://arxiv.org/abs/2209.10958
Author:
Perolat, Julien, de Vylder, Bart, Hennes, Daniel, Tarassov, Eugene, Strub, Florian, de Boer, Vincent, Muller, Paul, Connor, Jerome T., Burch, Neil, Anthony, Thomas, McAleer, Stephen, Elie, Romuald, Cen, Sarah H., Wang, Zhe, Gruslys, Audrunas, Malysheva, Aleksandra, Khan, Mina, Ozair, Sherjil, Timbers, Finbarr, Pohlen, Toby, Eccles, Tom, Rowland, Mark, Lanctot, Marc, Lespiau, Jean-Baptiste, Piot, Bilal, Omidshafiei, Shayegan, Lockhart, Edward, Sifre, Laurent, Beauguerlange, Nathalie, Munos, Remi, Silver, David, Singh, Satinder, Hassabis, Demis, Tuyls, Karl
We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet maste
External link:
http://arxiv.org/abs/2206.15378
Author:
Omidshafiei, Shayegan, Hennes, Daniel, Garnelo, Marta, Tarassov, Eugene, Wang, Zhe, Elie, Romuald, Connor, Jerome T., Muller, Paul, Graham, Ian, Spearman, William, Tuyls, Karl
In multiagent environments, several decision-making individuals interact while adhering to the dynamics constraints imposed by the environment. These interactions, combined with the potential stochasticity of the agents' decision-making processes, ma
External link:
http://arxiv.org/abs/2106.04219