Showing 1 - 10 of 486 for search: '"LIU Boyi"'
Competition-level code generation tasks pose significant challenges for current state-of-the-art large language models (LLMs). For example, on the LiveCodeBench-Hard dataset, models such as O1-Mini and O1-Preview achieve pass@1 rates of only 0.366 and …
External link:
http://arxiv.org/abs/2412.12544
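The entry above reports pass@1 rates on LiveCodeBench-Hard. For orientation only, pass@k is conventionally computed with the unbiased estimator of Chen et al. (2021); the sketch below illustrates that generic estimator, not code from the cited paper, and the function name and sample counts are illustrative.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n generations of which c pass all tests, is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 1 passing sample out of 10 generations, evaluated at k = 1
print(pass_at_k(n=10, c=1, k=1))  # 0.1
```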
Author:
Bytedance-Seed-Foundation-Code-Team, Cheng, Yao, Chen, Jianfeng, Chen, Jie, Chen, Li, Chen, Liyu, Chen, Wentao, Chen, Zhengyu, Geng, Shijie, Li, Aoyan, Li, Bo, Li, Bowen, Li, Linyi, Liu, Boyi, Liu, Jerry, Liu, Kaibo, Liu, Qi, Liu, Shukai, Liu, Siyao, Liu, Tianyi, Liu, Tingkai, Liu, Yongfei, Long, Rui, Mai, Jing, Ning, Guanghan, Peng, Z. Y., Shen, Kai, Su, Jiahao, Su, Jing, Sun, Tao, Sun, Yifan, Tao, Yunzhe, Wang, Guoyin, Wang, Siwei, Wang, Xuwu, Wang, Yite, Wang, Zihan, Xia, Jinxiang, Xiang, Liang, Xiao, Xia, Xiao, Yongsheng, Xi, Chenguang, Xin, Shulin, Xu, Jingjing, Xu, Shikun, Yang, Hongxia, Yang, Jack, Yang, Yingxiang, Yuan, Jianbo, Zhang, Jun, Zhang, Yufeng, Zhang, Yuyu, Zheng, Shen, Zhu, He, Zhu, Ming
As the capabilities of code large language models (LLMs) continue to expand, their applications across diverse code intelligence domains are rapidly increasing. However, most existing datasets only evaluate limited application domains. To address this …
External link:
http://arxiv.org/abs/2412.00535
Direct preference learning offers a promising and computation-efficient alternative beyond supervised fine-tuning (SFT) for improving code generation in coding large language models (LMs). However, the scarcity of reliable preference data is a bottleneck for the …
External link:
http://arxiv.org/abs/2411.13611
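The entry above concerns direct preference learning for code LMs. Purely as background, "direct preference learning" most commonly refers to objectives in the family of direct preference optimization (DPO); the canonical DPO loss over preference pairs (y_w preferred to y_l for prompt x) is sketched below, and the cited paper may use a different or modified objective.

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}\left[
      \log \sigma\!\Big(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \Big)
    \right]
```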
The rapid adoption of large language models (LLMs) presents new challenges for existing network architectures due to significant peak traffic and high communication uncertainty. Traditional wireless networks struggle to support such traffic efficiently, leading to …
External link:
http://arxiv.org/abs/2410.18499
Author:
Wang, Xuwu, Cui, Qiwen, Tao, Yunzhe, Wang, Yiran, Chai, Ziwei, Han, Xiaotian, Liu, Boyi, Yuan, Jianbo, Su, Jing, Wang, Guoyin, Liu, Tingkai, Chen, Liyu, Liu, Tianyi, Sun, Tao, Zhang, Yufeng, Zheng, Sirui, You, Quanzeng, Yang, Yang, Yang, Hongxia
Large language models (LLMs) have become increasingly pivotal across various domains, especially in handling complex data types. This includes structured data processing, as exemplified by ChartQA and ChatGPT-Ada, and multimodal unstructured data processing …
External link:
http://arxiv.org/abs/2410.00773
Author:
Zhan, Yi, Sun, Yang, Weng, Han, Cui, Longjie, Wang, Guifeng, Xie, Jiajun, Tian, Yu, Yin, Xiaoming, Liu, Boyi, Huang, Dongchi
In this paper, we propose a novel graph-based methodology to evaluate the functional correctness of SQL generation. Conventional metrics for assessing SQL code generation, such as matching-based and execution-based methods (e.g., exact set match and …
External link:
http://arxiv.org/abs/2407.14530
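The entry above contrasts a graph-based correctness metric with conventional matching-based and execution-based evaluation. As background only, the sketch below shows the standard execution-based baseline (run gold and predicted SQL on the same database and compare result sets); it does not implement the paper's graph-based method, and the toy schema, file name, and queries are made up for illustration.

```python
import sqlite3

def execution_match(db_path: str, gold_sql: str, pred_sql: str) -> bool:
    """Execution-based check: both queries run on the same database and
    must return the same set of rows (order ignored)."""
    conn = sqlite3.connect(db_path)
    try:
        gold_rows = sorted(conn.execute(gold_sql).fetchall())
        try:
            pred_rows = sorted(conn.execute(pred_sql).fetchall())
        except sqlite3.Error:
            return False  # predicted SQL fails to execute at all
        return gold_rows == pred_rows
    finally:
        conn.close()

# Illustrative usage on a small file-backed toy database
conn = sqlite3.connect("toy.db")
conn.execute("CREATE TABLE IF NOT EXISTS emp (name TEXT, salary REAL)")
conn.execute("DELETE FROM emp")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [("a", 10.0), ("b", 20.0)])
conn.commit()
conn.close()

print(execution_match("toy.db",
                      "SELECT name FROM emp WHERE salary > 15",
                      "SELECT name FROM emp WHERE salary >= 20"))  # True
```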
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
Author:
Liu, Zhihan, Lu, Miao, Zhang, Shenao, Liu, Boyi, Guo, Hongyi, Yang, Yingxiang, Blanchet, Jose, Wang, Zhaoran
Aligning generative models with human preference via RLHF typically suffers from overoptimization, where an imperfectly learned reward model can misguide the generative model to output undesired responses. We investigate this problem in a principled …
External link:
http://arxiv.org/abs/2405.16436
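The entry above studies reward overoptimization in RLHF and relates it to the SFT loss. As a rough schematic only (not the paper's exact formulation), adding the SFT negative log-likelihood as a regularizer to the usual KL-regularized reward-maximization objective gives a problem of the form below; β and λ are illustrative trade-off coefficients.

```latex
\max_{\pi_\theta}\;
  \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
    \big[\hat r(x, y)\big]
  \;-\; \beta\, \mathbb{E}_{x \sim \mathcal{D}}\,
    \mathrm{KL}\big(\pi_\theta(\cdot \mid x) \,\|\, \pi_{\mathrm{ref}}(\cdot \mid x)\big)
  \;-\; \lambda\,
  \underbrace{\mathbb{E}_{(x, y^{\mathrm{sft}}) \sim \mathcal{D}_{\mathrm{sft}}}
    \big[-\log \pi_\theta(y^{\mathrm{sft}} \mid x)\big]}_{\text{SFT loss as regularizer}}
```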
This paper presents EdgeLoc, an infrastructure-assisted, real-time localization system for autonomous driving that addresses the incompatibility between traditional localization methods and deep learning approaches. The system is built on top of the …
External link:
http://arxiv.org/abs/2405.12120
Author:
Zhang, Yufeng, Chen, Liyu, Liu, Boyi, Yang, Yingxiang, Cui, Qiwen, Tao, Yunzhe, Yang, Hongxia
Recent advances in reinforcement learning (RL) algorithms aim to enhance the performance of language models at scale. Yet, there is a noticeable absence of a cost-effective and standardized testbed tailored to evaluating and comparing these algorithms …
External link:
http://arxiv.org/abs/2403.07191
We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure, subject to a convex constraint. Designing algorithms for a constrained convex MDP faces several challenges, including …
External link:
http://arxiv.org/abs/2402.10810
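The entry above defines the constrained convex MDP problem. Writing $\mu^\pi$ for the state–action visitation (occupancy) measure induced by policy $\pi$, and assuming a discounted formulation for concreteness, the problem described in the snippet takes the form below; the precise constraint set and assumptions are in the paper.

```latex
\min_{\pi}\; f\big(\mu^{\pi}\big)
\quad \text{s.t.} \quad
g\big(\mu^{\pi}\big) \le 0,
\qquad
\mu^{\pi}(s, a) = (1 - \gamma) \sum_{t=0}^{\infty} \gamma^{t}\,
  \Pr\big(s_t = s,\, a_t = a \mid \pi\big),
```

where $f$ is a convex objective functional and $g$ a convex constraint functional of the visitation measure.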