Showing 1 - 10 of 79
for search: '"Zhu, Banghua"'
Author:
Li, Tianle, Chiang, Wei-Lin, Frick, Evan, Dunlap, Lisa, Wu, Tianhao, Zhu, Banghua, Gonzalez, Joseph E., Stoica, Ion
The rapid evolution of language models has necessitated the development of more challenging benchmarks. Current static benchmarks often struggle to consistently distinguish between the capabilities of different models and fail to align with real-world …
External link:
http://arxiv.org/abs/2406.11939
Let $\mathsf{TH}_k$ denote the $k$-out-of-$n$ threshold function: given $n$ input Boolean variables, the output is $1$ if and only if at least $k$ of the inputs are $1$. We consider the problem of computing the $\mathsf{TH}_k$ function using noisy re…
External link:
http://arxiv.org/abs/2403.07227
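The noiseless version of the threshold function defined in this abstract is simple to state in code. The sketch below is purely illustrative and is not from the paper, which studies the noisy setting:

```python
def threshold(inputs, k):
    """k-out-of-n threshold TH_k: return 1 iff at least k of the
    Boolean inputs are 1, else 0 (noiseless version)."""
    return 1 if sum(inputs) >= k else 0

# Majority on 5 bits is TH_3:
print(threshold([1, 0, 1, 1, 0], 3))  # -> 1
print(threshold([1, 0, 0, 1, 0], 3))  # -> 0
```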
Author:
Chiang, Wei-Lin, Zheng, Lianmin, Sheng, Ying, Angelopoulos, Anastasios Nikolas, Li, Tianle, Li, Dacheng, Zhang, Hao, Zhu, Banghua, Jordan, Michael, Gonzalez, Joseph E., Stoica, Ion
Large Language Models (LLMs) have unlocked new capabilities and applications; however, evaluating the alignment with human preferences still poses significant challenges. To address this issue, we introduce Chatbot Arena, an open platform for evaluating …
External link:
http://arxiv.org/abs/2403.04132
Generative AI's expanding footprint across numerous industries has led to both excitement and increased scrutiny. This paper delves into the unique security challenges posed by Generative AI, and outlines potential research directions for managing th…
External link:
http://arxiv.org/abs/2402.12617
Large language models (LLMs) have achieved huge success in numerous natural language processing (NLP) tasks. However, they face the challenge of significant resource consumption during inference. In this paper, we aim to improve the inference efficiency …
External link:
http://arxiv.org/abs/2402.01173
Reinforcement Learning from Human Feedback (RLHF) is a pivotal technique that aligns language models closely with human-centric values. The initial phase of RLHF involves learning human values using a reward model from ranking data. It is observed th…
External link:
http://arxiv.org/abs/2401.16335
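The reward-modeling phase this abstract describes is commonly fit to ranking data with a Bradley-Terry style objective: maximize the probability that the preferred response receives the higher reward. The sketch below illustrates that standard loss; it is an assumption about the setup, not code from the paper:

```python
import math

def bt_loss(reward_chosen, reward_rejected):
    """Bradley-Terry negative log-likelihood that the chosen
    response outscores the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger reward margin for the preferred response yields a smaller loss:
print(bt_loss(2.0, 0.0) < bt_loss(0.5, 0.0))  # -> True
```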
Author:
Sheng, Ying, Cao, Shiyi, Li, Dacheng, Zhu, Banghua, Li, Zhuohan, Zhuo, Danyang, Gonzalez, Joseph E., Stoica, Ion
High-demand LLM inference services (e.g., ChatGPT and BARD) support a wide range of requests from short chat conversations to long document reading. To ensure that all client requests are processed fairly, most major LLM inference services have reque…
External link:
http://arxiv.org/abs/2401.00588
Published in:
ICLR 2024 (Spotlight)
Reinforcement learning (RL) theory has largely focused on proving minimax sample complexity bounds. These require strategic exploration algorithms that use relatively limited function classes for representing the policy or value function. Our goal is …
External link:
http://arxiv.org/abs/2312.08369
Author:
Huang, Baihe, Zhu, Hanlin, Zhu, Banghua, Ramchandran, Kannan, Jordan, Michael I., Lee, Jason D., Jiao, Jiantao
We study statistical watermarking by formulating it as a hypothesis testing problem, a general framework which subsumes all previous statistical watermarking methods. Key to our formulation is a coupling of the output tokens and the rejection region, …
External link:
http://arxiv.org/abs/2312.07930
Author:
Sheng, Ying, Cao, Shiyi, Li, Dacheng, Hooper, Coleman, Lee, Nicholas, Yang, Shuo, Chou, Christopher, Zhu, Banghua, Zheng, Lianmin, Keutzer, Kurt, Gonzalez, Joseph E., Stoica, Ion
The "pretrain-then-finetune" paradigm is commonly adopted in the deployment of large language models. Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method, is often employed to adapt a base model to a multitude of tasks, resulting in …
External link:
http://arxiv.org/abs/2311.03285
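The LoRA idea referenced in this abstract can be sketched in a few lines: instead of updating the full weight matrix, train a low-rank update so each task's adapter stays small. The shapes and values below are illustrative only, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                              # hidden size, adapter rank (r << d)
W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable rank-r factor
B = np.zeros((d, r))                     # zero-init: adapter starts as a no-op

x = rng.standard_normal(d)
h = x @ (W + B @ A).T                    # adapted forward pass
# With B = 0, the adapted model reproduces the base model exactly:
assert np.allclose(h, x @ W.T)
```

Because only `A` and `B` (2·d·r parameters) are trained per task, many adapters can share one frozen base model, which is the multi-task serving setting the paper addresses.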