Search results - "Balazadeh, Vahid"

Report

Personalized Adaptation via In-Context Preference Learning

Authors: Lau, Allison; Choi, Younwoo; Balazadeh, Vahid; Chidambaram, Keertana; Syrgkanis, Vasilis; Krishnan, Rahul G.

Reinforcement Learning from Human Feedback (RLHF) is widely used to align Language Models (LMs) with human preferences. However, existing approaches often neglect individual user preferences, leading to suboptimal personalization. We present the Pref

External link: http://arxiv.org/abs/2410.14001

Report

Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity

Authors: Balazadeh, Vahid; Chidambaram, Keertana; Nguyen, Viet; Krishnan, Rahul G.; Syrgkanis, Vasilis

We study the problem of online sequential decision-making given auxiliary demonstrations from experts who made their decisions based on unobserved contextual information. These demonstrations can be viewed as solving related but slightly different ta

External link: http://arxiv.org/abs/2404.07266

Report

Order-based Structure Learning with Normalizing Flows

Authors: Kamkari, Hamidreza; Balazadeh, Vahid; Zehtab, Vahid; Krishnan, Rahul G.

Estimating the causal structure of observational data is a challenging combinatorial search problem that scales super-exponentially with graph size. Existing methods use continuous relaxations to make this problem computationally tractable but often

External link: http://arxiv.org/abs/2308.07480

Report

Partial Identification of Treatment Effects with Implicit Generative Models

Authors: Balazadeh, Vahid; Syrgkanis, Vasilis; Krishnan, Rahul G.

We consider the problem of partial identification, the estimation of bounds on the treatment effects from observational data. Although studied using discrete treatment variables or in specific causal graphs (e.g., instrumental variables), partial ide

External link: http://arxiv.org/abs/2210.08139

Report

Learning to Switch Among Agents in a Team via 2-Layer Markov Decision Processes

Authors: Balazadeh, Vahid; De, Abir; Singla, Adish; Gomez-Rodriguez, Manuel

Reinforcement learning agents have been mostly developed and evaluated under the assumption that they will operate in a fully autonomous manner -- they will take all actions. In this work, our goal is to develop algorithms that, by learning to switch

External link: http://arxiv.org/abs/2002.04258
