Showing 1 - 10 of 17,164 results for search: '"Koppel"'
We study the problem of finding an equilibrium of a mean field game (MFG) -- a policy performing optimally in a Markov decision process (MDP) determined by the induced mean field, where the mean field is a distribution over a population of agents and
External link:
http://arxiv.org/abs/2408.04780
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities
Training large language models (LLMs) in low-resource languages such as Hebrew poses unique challenges. In this paper, we introduce DictaLM2.0 and DictaLM2.0-Instruct, two LLMs derived from the Mistral model, trained on a substantial corpus of approx
External link:
http://arxiv.org/abs/2407.07080
Author:
Härmas Riinu, Palm Rasmus, Koppel Miriam, Kalder Laura, Russina Margarita, Kurig Heisi, Härk Eneli, Aruväli Jaan, Tallo Indrek, Embs Jan P., Lust Enn
Published in:
EPJ Web of Conferences, Vol 286, p 05001 (2023)
Microporous carbon materials are promising for hydrogen storage due to their structural variety, high specific surface area, large pore volume and relatively low cost. Carbide-derived carbons are highly valued as model materials because their porous
External link:
https://doaj.org/article/6bffa6d2910140708a2445959cff1d77
Author:
Ding, Mucong, Chakraborty, Souradip, Agrawal, Vibhu, Che, Zora, Koppel, Alec, Wang, Mengdi, Bedi, Amrit, Huang, Furong
Reinforcement Learning from Human Feedback (RLHF) is a key method for aligning large language models (LLMs) with human preferences. However, current offline alignment approaches like DPO, IPO, and SLiC rely heavily on fixed preference datasets, which
External link:
http://arxiv.org/abs/2406.15567
In this paper, we study the problem of robust cooperative multi-agent reinforcement learning (RL) where a large number of cooperative agents with distributed information aim to learn policies in the presence of stochastic and non-stochas
External link:
http://arxiv.org/abs/2406.13992
The conditional mean embedding (CME) encodes Markovian stochastic kernels through their actions on probability distributions embedded within the reproducing kernel Hilbert spaces (RKHS). The CME plays a key role in several well-known machine learning
External link:
http://arxiv.org/abs/2405.07432
Published in:
In Proceedings of EACL 2023, 849-864 (2023)
Semitic morphologically-rich languages (MRLs) are characterized by extreme word ambiguity. Because most vowels are omitted in standard texts, many of the words are homographs with multiple possible analyses, each with a different pronunciation and di
External link:
http://arxiv.org/abs/2405.07099
Author:
Koppel, Paula D, De Gagne, Jennie C
Published in:
JMIR Research Protocols, Vol 10, Iss 6, p e27940 (2021)
Background: Telehealth videoconferencing has largely been embraced by health care providers and patients during the COVID-19 pandemic; however, little is known about specific techniques for building rapport and provider-patient relationships in this ca
External link:
https://doaj.org/article/2d28e6c4605b4524a165edec97aa82c5
Author:
Patel, Bhrij, Suttle, Wesley A., Koppel, Alec, Aggarwal, Vaneet, Sadler, Brian M., Bedi, Amrit Singh, Manocha, Dinesh
In the context of average-reward reinforcement learning, the requirement for oracle knowledge of the mixing time, a measure of the duration a Markov chain under a fixed policy needs to achieve its stationary distribution, poses a significant challeng
External link:
http://arxiv.org/abs/2403.11925
We address in this paper Reinforcement Learning (RL) among agents that are grouped into teams such that there is cooperation within each team but general-sum (non-zero sum) competition across different teams. To develop an RL method that provably ach
External link:
http://arxiv.org/abs/2403.11345