Showing 1 - 10 of 781 for search: '"Sham M"'
Author:
Jelassi, Samy, Mohri, Clara, Brandfonbrener, David, Gu, Alex, Vyas, Nikhil, Anand, Nikhil, Alvarez-Melis, David, Li, Yuanzhi, Kakade, Sham M., Malach, Eran
The Mixture-of-Experts (MoE) architecture enables a significant increase in the total number of model parameters with minimal computational overhead. However, it is not clear what performance tradeoffs, if any, exist between MoEs and standard dense t…
External link:
http://arxiv.org/abs/2410.19034
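The snippet above notes that MoE layers add parameters without proportional compute: only a few experts run per token. A minimal top-k routing sketch of that idea, with made-up scalar "experts" and random gate weights purely for illustration (not the architecture from the paper):

```python
import math
import random

random.seed(0)

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(x, experts, gate_weights, k=2):
    # Score every expert with a learned gate, but evaluate only the
    # top-k: compute cost tracks k, not the total expert count.
    scores = softmax([sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights])
    topk = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    norm = sum(scores[i] for i in topk)
    # Combine the selected experts, reweighted to sum to 1.
    return sum(scores[i] / norm * experts[i](x) for i in topk)

# Toy setup: 8 scalar-output "experts" over a 4-dim input.
experts = [lambda x, c=c: c * sum(x) for c in range(8)]
gate_weights = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
y = moe_forward([0.5, -0.2, 0.1, 0.3], experts, gate_weights, k=2)
```

With k=1 the router degenerates to picking a single expert, whose output passes through unscaled; raising k trades compute for a smoother mixture.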
We initiate the study of Multi-Agent Reinforcement Learning from Human Feedback (MARLHF), exploring both theoretical foundations and empirical validations. We define the task as identifying Nash equilibrium from a preference-only offline dataset in g…
External link:
http://arxiv.org/abs/2409.00717
Author:
Wang, Ziqi, Zhang, Hanlin, Li, Xiner, Huang, Kuan-Hao, Han, Chi, Ji, Shuiwang, Kakade, Sham M., Peng, Hao, Ji, Heng
Position bias has proven to be a prevalent issue of modern language models (LMs), where the models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness…
External link:
http://arxiv.org/abs/2407.01100
Author:
Zhang, Edwin, Zhu, Vincent, Saphra, Naomi, Kleiman, Anat, Edelman, Benjamin L., Tambe, Milind, Kakade, Sham M., Malach, Eran
Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outp…
External link:
http://arxiv.org/abs/2406.11741
Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approxi…
External link:
http://arxiv.org/abs/2406.08466
Author:
Shen, Ethan, Fan, Alan, Pratt, Sarah M., Park, Jae Sung, Wallingford, Matthew, Kakade, Sham M., Holtzman, Ari, Krishna, Ranjay, Farhadi, Ali, Kusupati, Aditya
Many applications today provide users with multiple auto-complete drafts as they type, including GitHub's code completion, Gmail's smart compose, and Apple's messaging auto-suggestions. Under the hood, language models support this by running an autor…
External link:
http://arxiv.org/abs/2405.18400
The $k$-parity problem is a classical problem in computational complexity and algorithmic theory, serving as a key benchmark for understanding computational classes. In this paper, we solve the $k$-parity problem with stochastic gradient descent (SGD…
External link:
http://arxiv.org/abs/2404.12376
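For reference, the $k$-parity target the snippet mentions is simply the product of $k$ coordinates of a $\pm 1$ input, which is $+1$ exactly when an even number of those coordinates are $-1$. A minimal definition (the convention of taking the first $k$ coordinates is just for illustration):

```python
def k_parity(x, k):
    # Parity over {-1,+1} inputs: product of the first k coordinates.
    # +1 iff an even number of the first k entries are -1.
    p = 1
    for xi in x[:k]:
        p *= xi
    return p

assert k_parity([1, -1, -1, 1, -1], k=3) == 1   # two -1s among first 3
assert k_parity([-1, 1, 1, 1, 1], k=3) == -1    # one -1 among first 3
```

The function is hard for gradient methods to learn at large $k$ because it depends on a global interaction among the relevant bits, which is what makes it a standard benchmark.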
Transformers are the dominant architecture for sequence modeling, but there is growing interest in models that use a fixed-size latent state that does not depend on the sequence length, which we refer to as "generalized state space models" (GSSMs). I…
External link:
http://arxiv.org/abs/2402.01032
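The fixed-size latent state mentioned above can be illustrated with a linear recurrence $h_t = A h_{t-1} + B x_t$: the state summarizing the sequence never grows with its length, unlike a transformer's key-value cache. The matrices below are arbitrary toy values, and this is a generic state-space sketch rather than any specific GSSM from the paper:

```python
def gssm_step(h, x, A, B):
    # One recurrence step: h_t = A @ h_{t-1} + B * x_t, written out
    # element-wise for a scalar input x and an n-dim state h.
    n = len(h)
    return [sum(A[i][j] * h[j] for j in range(n)) + B[i] * x for i in range(n)]

def run_gssm(xs, A, B):
    # Fold an arbitrarily long sequence into a constant-size state.
    h = [0.0] * len(B)
    for x in xs:
        h = gssm_step(h, x, A, B)
    return h

A = [[0.9, 0.0], [0.1, 0.5]]  # toy 2x2 state-transition matrix
B = [1.0, 0.0]                # toy input projection
h = run_gssm([1.0, 2.0, 3.0], A, B)
```

Whatever the sequence length, `h` stays two-dimensional; memory of early tokens survives only through what the recurrence retains in that state.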
We consider the problem of decentralized multi-agent reinforcement learning in Markov games. A fundamental question is whether there exist algorithms that, when adopted by all agents and run independently in a decentralized fashion, lead to no-regret…
External link:
http://arxiv.org/abs/2303.12287
This paper considers the problem of learning a single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called G…
External link:
http://arxiv.org/abs/2303.02255
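The snippet describes a Perceptron-type algorithm for a single ReLU neuron. A generic sketch of that family of updates, $w \leftarrow w + \eta\,(y - \mathrm{relu}(w \cdot x))\,x$, which skips differentiating through the ReLU; this is an illustrative stand-in, not necessarily the exact algorithm analyzed in the paper, and the teacher weights and learning rate are made up:

```python
import random

random.seed(1)

def relu(z):
    return max(z, 0.0)

def relu_regression(samples, dim, lr=0.1, epochs=50):
    # Perceptron-style update for one ReLU neuron:
    #   w <- w + lr * (y - relu(w.x)) * x
    # Note there is no ReLU derivative in the update, which is what
    # distinguishes this family from plain SGD on the squared loss.
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in samples:
            pred = relu(sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + lr * (y - pred) * xi for wi, xi in zip(w, x)]
    return w

# Noiseless realizable data from a toy teacher neuron w* = [1.0, -0.5].
w_star = [1.0, -0.5]
data = []
for _ in range(200):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    data.append((x, relu(sum(a * b for a, b in zip(w_star, x)))))
w = relu_regression(data, dim=2)
```

Because the data here are noiseless and realizable, the teacher weights are a fixed point of the update, so the iterates settle near them; the overparameterized analysis in the paper concerns the harder regime where dimension exceeds sample count.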