Search results - "Gu, Quanquan"

Report

Uncertainty-Aware Reward-Free Exploration with General Function Approximation

Authors: Zhang, Junkai; Zhang, Weitong; Zhou, Dongruo; Gu, Quanquan

Mastering multiple tasks through exploration and learning in an environment poses a significant challenge in reinforcement learning (RL). Unsupervised RL has been introduced to address this challenge by training policies with intrinsic rewards rather

External link: http://arxiv.org/abs/2406.16255

Report

Self-Play Preference Optimization for Language Model Alignment

Authors: Wu, Yue; Sun, Zhiqing; Yuan, Huizhuo; Ji, Kaixuan; Yang, Yiming; Gu, Quanquan

Traditional reinforcement learning from human feedback (RLHF) approaches relying on parametric models like the Bradley-Terry model fall short in capturing the intransitivity and irrationality in human preferences. Recent advancements suggest that dir

External link: http://arxiv.org/abs/2405.00675

Report

Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent

Authors: Kou, Yiwen; Chen, Zixiang; Gu, Quanquan; Kakade, Sham M.

The $k$-parity problem is a classical problem in computational complexity and algorithmic theory, serving as a key benchmark for understanding computational classes. In this paper, we solve the $k$-parity problem with stochastic gradient descent (SGD

External link: http://arxiv.org/abs/2404.12376

Report

Guided Discrete Diffusion for Electronic Health Record Generation

Authors: Han, Jun; Chen, Zixiang; Li, Yongqian; Kou, Yiwen; Halperin, Eran; Tillman, Robert E.; Gu, Quanquan

Electronic health records (EHRs) are a pivotal data source that enables numerous applications in computational medicine, e.g., disease progression prediction, clinical trial design, and health economics and outcomes research. Despite wide usability,

External link: http://arxiv.org/abs/2404.12314

Report

Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback

Authors: Di, Qiwei; He, Jiafan; Gu, Quanquan

Learning from human feedback plays an important role in aligning generative models, such as large language models (LLM). However, the effectiveness of this approach can be influenced by adversaries, who may intentionally provide misleading preference

External link: http://arxiv.org/abs/2404.10776

Report

Settling Constant Regrets in Linear Markov Decision Processes

Authors: Zhang, Weitong; Fan, Zhiyuan; He, Jiafan; Gu, Quanquan

We study the constant regret guarantees in reinforcement learning (RL). Our objective is to design an algorithm that incurs only finite regret over infinite episodes with high probability. We introduce an algorithm, Cert-LSVI-UCB, for misspecified li

External link: http://arxiv.org/abs/2404.10745

Report

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Authors: Li, Xuheng; Zhao, Heyang; Gu, Quanquan

Contextual dueling bandits, where a learner compares two options based on context and receives feedback indicating which was preferred, extends classic dueling bandits by incorporating contextual information for decision-making and preference learnin

External link: http://arxiv.org/abs/2404.06013

Report

Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization

Authors: Zhou, Xiangxin; Xue, Dongyu; Chen, Ruizhe; Zheng, Zaixiang; Wang, Liang; Gu, Quanquan

Antibody design, a crucial task with significant implications across various disciplines such as therapeutics and biology, presents considerable challenges due to its intricate nature. In this paper, we tackle antigen-specific antibody sequence-struc

External link: http://arxiv.org/abs/2403.16576

Report

Protein Conformation Generation via Force-Guided SE(3) Diffusion Models

Authors: Wang, Yan; Wang, Lihao; Shen, Yuning; Wang, Yiqun; Yuan, Huizhuo; Wu, Yue; Gu, Quanquan

The conformational landscape of proteins is crucial to understanding their functionality in complex biological processes. Traditional physics-based computational methods, such as molecular dynamics (MD) simulations, suffer from rare event sampling an

External link: http://arxiv.org/abs/2403.14088

Report

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

Authors: Zhou, Xiangxin; Cheng, Xiwei; Yang, Yuwei; Bao, Yu; Wang, Liang; Gu, Quanquan

Recently, 3D generative models have shown promising performances in structure-based drug design by learning to generate ligands given target binding sites. However, only modeling the target-ligand distribution can hardly fulfill one of the main goals

External link: http://arxiv.org/abs/2403.13829
