Showing 1 - 10 of 13,754 for search: '"Koppel, A."'
Author:
Lock, Edwin, Evans, Benjamin Patrick, Kreacic, Eleonora, Bhatt, Sujay, Koppel, Alec, Ganesh, Sumitra, Goldberg, Paul W.
We propose a decentralized market model in which agents can negotiate bilateral contracts. This builds on a similar, but centralized, model of trading networks introduced by Hatfield et al. (2013). Prior work has established that fully-substitutable ...
External link:
http://arxiv.org/abs/2412.13972
Author:
Park, Jung Yeon, Bhatt, Sujay, Zeng, Sihan, Wong, Lawson L. S., Koppel, Alec, Ganesh, Sumitra, Walters, Robin
Equivariant neural networks have shown great success in reinforcement learning, improving sample efficiency and generalization when there is symmetry in the task. However, in many problems, only approximate symmetry is present, which makes imposing ...
External link:
http://arxiv.org/abs/2411.04225
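As an illustration of the symmetry structure this abstract refers to (not the paper's method), one standard way to impose exact invariance to a small finite group is to average a network's output over group transformations of its input. The minimal sketch below assumes a 4-fold rotation group acting on image-like observations; the base network is a hypothetical placeholder.

# Minimal sketch (not from the paper): exact invariance to 90-degree rotations
# by averaging a base network's output over the four rotated copies of the input.
import torch
import torch.nn as nn

class RotationAveragedNet(nn.Module):
    """Wraps a base network and averages its output over the rotation group C4."""
    def __init__(self, base: nn.Module):
        super().__init__()
        self.base = base

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W); rotate by 0/90/180/270 degrees and average.
        outputs = [self.base(torch.rot90(x, k, dims=(-2, -1))) for k in range(4)]
        return torch.stack(outputs, dim=0).mean(dim=0)

# Usage: wrap any image-based value head; the output is then rotation-invariant.
base = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.Flatten(), nn.LazyLinear(4))
net = RotationAveragedNet(base)
values = net(torch.randn(2, 3, 8, 8))

When the task's symmetry is only approximate, hard-coding such exact averaging can hurt, which is the tension the abstract points to.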
Author:
Xu, Yuancheng, Sehwag, Udari Madhushani, Koppel, Alec, Zhu, Sicheng, An, Bang, Huang, Furong, Ganesh, Sumitra
Large Language Models (LLMs) exhibit impressive capabilities but require careful alignment with human preferences. Traditional training-time methods finetune LLMs using human preference datasets but incur significant training costs and require ...
External link:
http://arxiv.org/abs/2410.08193
The standard contextual bandit framework assumes fully observable and actionable contexts. In this work, we consider a new bandit setting with partially observable, correlated contexts and linear payoffs, motivated by applications in finance ...
External link:
http://arxiv.org/abs/2409.11521
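For context, the fully observable linear-payoff baseline that this setting relaxes is commonly handled with a LinUCB-style rule. The sketch below is that standard baseline only, with illustrative dimensions and exploration weight; it does not implement the paper's partially observable, correlated-context model.

# Standard (disjoint-arm) LinUCB for linear-payoff contextual bandits.
import numpy as np

def linucb_select(A, b, contexts, alpha=1.0):
    """Choose an arm from per-arm statistics A (Gram matrices) and b (responses)."""
    scores = []
    for A_a, b_a, x in zip(A, b, contexts):
        A_inv = np.linalg.inv(A_a)
        theta = A_inv @ b_a                      # ridge estimate of the arm's parameter
        bonus = alpha * np.sqrt(x @ A_inv @ x)   # optimistic confidence width
        scores.append(theta @ x + bonus)
    return int(np.argmax(scores))

def linucb_update(A, b, arm, x, reward):
    """Rank-one update of the chosen arm's sufficient statistics."""
    A[arm] += np.outer(x, x)
    b[arm] += reward * x

# Usage with 3 arms and 5-dimensional contexts (illustrative values).
d, n_arms = 5, 3
A = [np.eye(d) for _ in range(n_arms)]
b = [np.zeros(d) for _ in range(n_arms)]
contexts = [np.random.randn(d) for _ in range(n_arms)]
arm = linucb_select(A, b, contexts)
linucb_update(A, b, arm, contexts[arm], reward=1.0)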
We consider discrete-time stationary mean field games (MFG) with unknown dynamics and design algorithms for finding the equilibrium in finite time. Prior solutions to the problem build upon either the contraction assumption on a mean field optimality ...
External link:
http://arxiv.org/abs/2408.04780
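For reference, the equilibrium object in a discrete-time stationary MFG is usually a mutually consistent policy-distribution pair; the notation below is the generic textbook formulation, not taken from this paper:

\[
\pi^* \in \arg\max_{\pi}\; \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t, \mu^*)\Big],
\qquad
\mu^* = \Gamma(\pi^*, \mu^*),
\]

where $\mu^*$ is the stationary population (mean field) distribution, the reward $r(\cdot,\cdot,\mu)$ and transition kernel $P(\cdot \mid s, a, \mu)$ depend on the mean field, and $\Gamma(\pi,\mu)$ maps a policy and mean field to the stationary state distribution they induce. Conditions such as the contraction assumption mentioned in the abstract are ways of guaranteeing that this fixed point exists and can be reached by iteration.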
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities
Training large language models (LLMs) in low-resource languages such as Hebrew poses unique challenges. In this paper, we introduce DictaLM2.0 and DictaLM2.0-Instruct, two LLMs derived from the Mistral model, trained on a substantial corpus of approximately ...
External link:
http://arxiv.org/abs/2407.07080
Author:
Ding, Mucong, Chakraborty, Souradip, Agrawal, Vibhu, Che, Zora, Koppel, Alec, Wang, Mengdi, Bedi, Amrit, Huang, Furong
Reinforcement Learning from Human Feedback (RLHF) is a key method for aligning large language models (LLMs) with human preferences. However, current offline alignment approaches like DPO, IPO, and SLiC rely heavily on fixed preference datasets, which ...
External link:
http://arxiv.org/abs/2406.15567
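For reference, the DPO objective named in the abstract optimises the policy directly on a fixed preference dataset $\mathcal{D}$ of prompts $x$ with preferred/dispreferred responses $(y_w, y_l)$; in standard notation (not copied from this paper):

\[
\mathcal{L}_{\mathrm{DPO}}(\theta)
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\Big[
\log \sigma\Big(
\beta \log \tfrac{\pi_{\theta}(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \tfrac{\pi_{\theta}(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\Big)\Big],
\]

where $\pi_{\mathrm{ref}}$ is a frozen reference policy, $\sigma$ is the logistic function, and $\beta$ sets the strength of the implicit KL regularisation. The dependence on the fixed dataset $\mathcal{D}$ is the limitation the abstract highlights.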
In this paper, we study the problem of robust cooperative multi-agent reinforcement learning (RL), where a large number of cooperative agents with distributed information aim to learn policies in the presence of \emph{stochastic} and \emph{non-stochastic} ...
External link:
http://arxiv.org/abs/2406.13992
The conditional mean embedding (CME) encodes Markovian stochastic kernels through their actions on probability distributions embedded within reproducing kernel Hilbert spaces (RKHS). The CME plays a key role in several well-known machine learning ...
External link:
http://arxiv.org/abs/2405.07432
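As a concrete, textbook illustration of the object this abstract refers to (not the paper's contribution), the empirical CME built from samples (x_i, y_i) approximates conditional expectations by kernel-weighted averages; the kernel, bandwidth, and regularisation below are arbitrary illustrative choices.

# Textbook empirical conditional mean embedding: the weights beta(x) satisfy
# mu_{Y|X=x} ≈ sum_i beta_i(x) * phi(y_i), with beta(x) = (K_X + n*lam*I)^{-1} k_X(x).
import numpy as np

def rbf_gram(A, B, bandwidth=1.0):
    """Gaussian RBF Gram matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def cme_weights(X, x_query, lam=1e-3, bandwidth=1.0):
    """Weights defining the empirical CME at the query point x_query."""
    n = X.shape[0]
    K = rbf_gram(X, X, bandwidth)
    k_x = rbf_gram(X, x_query[None, :], bandwidth)[:, 0]
    return np.linalg.solve(K + n * lam * np.eye(n), k_x)

# Usage: approximate E[Y | X = 0.5] on synthetic data as a beta-weighted average.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
Y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
beta = cme_weights(X, x_query=np.array([0.5]))
estimate = float(beta @ Y)   # should be close to sin(0.5)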
Published in:
In Proceedings of EACL 2023, 849-864 (2023)
Semitic morphologically-rich languages (MRLs) are characterized by extreme word ambiguity. Because most vowels are omitted in standard texts, many of the words are homographs with multiple possible analyses, each with a different pronunciation and ...
External link:
http://arxiv.org/abs/2405.07099