Showing 1 - 10 of 38 for search: '"Bakhtin, Anton"'
Author:
Durmus, Esin, Nguyen, Karina, Liao, Thomas I., Schiefer, Nicholas, Askell, Amanda, Bakhtin, Anton, Chen, Carol, Hatfield-Dodds, Zac, Hernandez, Danny, Joseph, Nicholas, Lovitt, Liane, McCandlish, Sam, Sikder, Orowa, Tamkin, Alex, Thamkul, Janel, Kaplan, Jared, Clark, Jack, Ganguli, Deep
Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset … (an illustrative code sketch follows the link below)
External link:
http://arxiv.org/abs/2306.16388
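A minimal sketch of the kind of quantitative comparison the abstract above points to, under loose assumptions: if a model's answer distribution and a country's survey responses are both represented as probability vectors over the same answer options, one plausible similarity score is 1 minus the Jensen-Shannon distance. The metric choice, function names, and numbers below are illustrative assumptions, not taken from the paper.

    # Illustrative sketch (assumption): score how close a model's answer
    # distribution is to a country's survey distribution for one question.
    import numpy as np
    from scipy.spatial.distance import jensenshannon

    def opinion_similarity(model_probs, survey_probs):
        """Return 1 - Jensen-Shannon distance between two probability vectors
        over the same answer options; values near 1 mean the model's answers
        resemble that population's survey responses."""
        return 1.0 - jensenshannon(np.asarray(model_probs), np.asarray(survey_probs))

    # Hypothetical example with three answer options.
    model = [0.6, 0.3, 0.1]
    print(opinion_similarity(model, [0.55, 0.35, 0.10]))  # close -> near 1.0
    print(opinion_similarity(model, [0.10, 0.20, 0.70]))  # far   -> near 0.0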
Author:
Bakhtin, Anton, Wu, David J, Lerer, Adam, Gray, Jonathan, Jacob, Athul Paul, Farina, Gabriele, Miller, Alexander H, Brown, Noam
No-press Diplomacy is a complex strategy game involving both cooperation and competition that has served as a benchmark for multi-agent AI research. While self-play reinforcement learning has resulted in numerous successes in purely adversarial games …
External link:
http://arxiv.org/abs/2210.05492
Author:
Hu, Hengyuan, Sokota, Samuel, Wu, David, Bakhtin, Anton, Lupu, Andrei, Cui, Brandon, Foerster, Jakob N.
Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world. In this paper, we focus on a specific subclass of coordination problems in which humans are able to discover self-explaining deviations (SEDs). SEDs are actions …
External link:
http://arxiv.org/abs/2207.12322
Author:
Jacob, Athul Paul, Wu, David J., Farina, Gabriele, Lerer, Adam, Hu, Hengyuan, Bakhtin, Anton, Andreas, Jacob, Brown, Noam
We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior. Imitation learning is effective at predicting human actions but may not match the strength of expert humans … (an illustrative code sketch follows the link below)
External link:
http://arxiv.org/abs/2112.07544
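A hedged sketch of the imitation-learning baseline the abstract above starts from: behavioral cloning, i.e. fitting a policy to logged human actions with a cross-entropy loss. The network sizes, shapes, and names are illustrative assumptions, not the paper's own method.

    # Illustrative sketch (assumption): behavioral cloning on human game data.
    import torch
    import torch.nn as nn

    policy = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 50))
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

    def bc_step(states, human_actions):
        """states: (B, 128) observation encodings; human_actions: (B,) action ids
        chosen by humans. Minimizes cross-entropy so the policy imitates them."""
        logits = policy(states)
        loss = nn.functional.cross_entropy(logits, human_actions)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()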
Prior AI successes in complex games have largely focused on settings with at most hundreds of actions at each decision point. In contrast, Diplomacy is a game with more than 10^20 possible actions per turn. Previous attempts to address games with large …
External link:
http://arxiv.org/abs/2110.02924
A common approach to solving physical reasoning tasks is to train a value learner on example tasks. A limitation of such an approach is that it requires learning about object dynamics solely from reward values assigned to the final state of a rollout … (an illustrative code sketch follows the link below)
External link:
http://arxiv.org/abs/2102.10336
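A hedged sketch of the baseline the abstract above describes: a value learner supervised only by the reward attached to the final state of each rollout (solved or not solved), with no intermediate dynamics signal. The tiny network, feature size, and names are illustrative assumptions.

    # Illustrative sketch (assumption): value learner trained only on
    # final-state rewards (1 = task solved, 0 = not solved).
    import torch
    import torch.nn as nn

    value_net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
    optimizer = torch.optim.Adam(value_net.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    def value_step(task_encodings, final_rewards):
        """task_encodings: (B, 16) initial scene/action encodings;
        final_rewards: (B,) float 0/1 labels taken from each rollout's end."""
        logits = value_net(task_encodings).squeeze(-1)
        loss = loss_fn(logits, final_rewards)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()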
Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings. In contrast, Diplomacy is a game of shifting alliances that involves both cooperation and competition. For this reason, Diplomacy has …
External link:
http://arxiv.org/abs/2010.02923
The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior …
External link:
http://arxiv.org/abs/2007.13544
Published in:
ICLR 2020
Text generation is ubiquitous in many NLP tasks, from summarization, to dialogue and machine translation. The dominant parametric approach is based on locally normalized models which predict one word at a time. While these work remarkably well, they … (an illustrative code sketch follows the link below)
External link:
http://arxiv.org/abs/2004.11714
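The abstract above refers to the dominant locally normalized approach, which factorizes the sequence probability into per-token conditionals and samples one word at a time. A minimal hedged sketch of that sampling loop follows; the model interface (returning logits of shape (1, T, vocab)) is an assumption, not the paper's code.

    # Illustrative sketch (assumption): locally normalized, one-word-at-a-time
    # sampling: softmax over the vocabulary at each step, conditioned on the prefix.
    import torch

    def sample(model, input_ids, max_new_tokens=20, temperature=1.0):
        for _ in range(max_new_tokens):
            logits = model(input_ids)[:, -1, :] / temperature  # scores for next token
            probs = torch.softmax(logits, dim=-1)              # local normalization
            next_id = torch.multinomial(probs, num_samples=1)  # sample one word
            input_ids = torch.cat([input_ids, next_id], dim=-1)
        return input_ids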
Published in:
Journal of Machine Learning Research 21 (2020) 1-41
Current large-scale auto-regressive language models display impressive fluency and can generate convincing text. In this work we start by asking the question: Can the generations of these models be reliably distinguished from real text by statistical … (an illustrative code sketch follows the link below)
External link:
http://arxiv.org/abs/2004.10188
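A hedged sketch of the kind of statistical discriminator the abstract above asks about: a simple binary classifier trained to separate human-written from model-generated text. The TF-IDF features, classifier, and toy strings are illustrative assumptions, not the paper's setup.

    # Illustrative sketch (assumption): real-vs-generated text classifier.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    real_texts = ["a short human-written passage", "another human-written passage"]
    fake_texts = ["a short model-generated passage", "another model-generated passage"]
    texts = real_texts + fake_texts
    labels = [1] * len(real_texts) + [0] * len(fake_texts)  # 1 = human-written

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression(max_iter=1000))
    clf.fit(texts, labels)
    print(clf.predict_proba(["a new passage to score"])[:, 1])  # P(human-written)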