Zobrazeno 1 - 10
of 72
pro vyhledávání: '"Bai, Yu"'
Large Language Models (LLMs) often memorize sensitive, private, or copyrighted data during pre-training. LLM unlearning aims to eliminate the influence of undesirable data from the pre-trained model while preserving the model's utilities on other tas
Externí odkaz:
http://arxiv.org/abs/2404.05868
Inverse Reinforcement Learning (IRL) -- the problem of learning reward functions from demonstrations of an \emph{expert policy} -- plays a critical role in developing intelligent systems. While widely used in applications, theoretical understandings
Externí odkaz:
http://arxiv.org/abs/2312.00054
Large transformer models pretrained on offline reinforcement learning datasets have demonstrated remarkable in-context reinforcement learning (ICRL) capabilities, where they can make good decisions when prompted with interaction trajectories from uns
Externí odkaz:
http://arxiv.org/abs/2310.08566
Attention layers -- which map a sequence of inputs to a sequence of outputs -- are core building blocks of the Transformer architecture which has achieved significant breakthroughs in modern artificial intelligence. This paper presents a rigorous the
Externí odkaz:
http://arxiv.org/abs/2307.11353
This paper studies the sample-efficiency of learning in Partially Observable Markov Decision Processes (POMDPs), a challenging problem in reinforcement learning that is known to be exponentially hard in the worst-case. Motivated by real-world setting
Externí odkaz:
http://arxiv.org/abs/2307.02884
Neural sequence models based on the transformer architecture have demonstrated remarkable \emph{in-context learning} (ICL) abilities, where they can perform new tasks when prompted with training and test examples, without any parameter update to the
Externí odkaz:
http://arxiv.org/abs/2306.04637
We study the problem of uncertainty quantification via prediction sets, in an online setting where the data distribution may vary arbitrarily over time. Recent work develops online conformal prediction techniques that leverage regret minimization alg
Externí odkaz:
http://arxiv.org/abs/2302.07869
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where the description length of the game as well as the complexity of many existing learning algorithms scale exponentially with the number of agents. While
Externí odkaz:
http://arxiv.org/abs/2302.06606
We study offline multi-agent reinforcement learning (RL) in Markov games, where the goal is to learn an approximate equilibrium -- such as Nash equilibrium and (Coarse) Correlated Equilibrium -- from an offline dataset pre-collected from the game. Ex
Externí odkaz:
http://arxiv.org/abs/2302.02571
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging \emph{partially observable} setting. While it is well-established that learning in Partially Observable Markov Decision Processes (POMDPs) requires exponentia
Externí odkaz:
http://arxiv.org/abs/2302.01333