Showing 1 - 10 of 64 for the search: '"Xia, Heming"'
Speculative decoding (SD) has emerged as a widely used paradigm to accelerate the inference of large language models (LLMs) without compromising generation quality. It works by first employing a compact model to draft multiple tokens efficiently and …
External link:
http://arxiv.org/abs/2410.06916
Tool learning aims to enhance and expand large language models' (LLMs) capabilities with external tools, which has gained significant attention recently. Current methods have shown that LLMs can effectively handle a certain amount of tools through in…
External link:
http://arxiv.org/abs/2406.17465
Authors:
Luo, Weiyao, Zheng, Suncong, Xia, Heming, Wang, Weikang, Lei, Yan, Liu, Tianyu, Chen, Shuang, Sui, Zhifang
Large language models (LLMs) have shown promising efficacy across various tasks, becoming powerful tools in numerous aspects of human life. However, Transformer-based LLMs suffer a performance degradation when modeling long-term contexts due to they …
External link:
http://arxiv.org/abs/2406.10985
Understanding the deep semantics of images is essential in the era dominated by social media. However, current research works primarily on the superficial description of images, revealing a notable deficiency in the systematic investigation of the in…
External link:
http://arxiv.org/abs/2402.11281
Authors:
Xia, Heming, Yang, Zhe, Dong, Qingxiu, Wang, Peiyi, Li, Yongqi, Ge, Tao, Liu, Tianyu, Li, Wenjie, Sui, Zhifang
To mitigate the high inference latency stemming from autoregressive decoding in Large Language Models (LLMs), Speculative Decoding has emerged as a novel decoding paradigm for LLM inference. In each decoding step, this method first drafts several fut…
External link:
http://arxiv.org/abs/2401.07851
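The draft-then-verify loop that the two speculative-decoding entries above describe can be illustrated with a minimal greedy sketch. This is a toy, not the papers' method: `draft_model` and `target_model` are invented stand-in next-token functions rather than real LLMs, and real systems compare token distributions rather than greedy picks, but the accept-the-longest-agreeing-prefix idea is the same.

```python
def draft_model(ctx):
    # Tiny stand-in for the compact drafter: usually agrees with the target.
    return (ctx[-1] * 2) % 7

def target_model(ctx):
    # Stand-in for the large target model: deviates on some contexts.
    return (ctx[-1] * 2) % 7 if ctx[-1] % 5 != 0 else 6

def speculative_step(ctx, k=4):
    """One decoding step: draft k tokens with the small model, then verify
    them with the large model and keep the longest agreeing prefix, plus
    the target's correction (or a bonus token if all k drafts match)."""
    # Draft phase: k cheap autoregressive steps with the small model.
    draft, tmp = [], list(ctx)
    for _ in range(k):
        t = draft_model(tmp)
        draft.append(t)
        tmp.append(t)
    # Verify phase: the target checks each drafted token in order.
    accepted, tmp = [], list(ctx)
    for t in draft:
        v = target_model(tmp)
        if v == t:
            accepted.append(t)
            tmp.append(t)
        else:
            accepted.append(v)  # target's correction ends the step
            break
    else:
        accepted.append(target_model(tmp))  # bonus token: all drafts matched
    return accepted

ctx = [3]
for _ in range(3):
    ctx += speculative_step(ctx)
print(ctx)
```

With greedy verification this is lossless: the accepted sequence is identical to what the target model alone would have generated token by token, only produced in fewer target-model passes.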
Authors:
Tong, Shoujie, Xia, Heming, Dai, Damai, Xu, Runxin, Liu, Tianyu, Lin, Binghuai, Cao, Yunbo, Sui, Zhifang
Pretrained language models have achieved remarkable success in natural language understanding. However, fine-tuning pretrained models on limited training data tends to overfit and thus diminish performance. This paper presents Bi-Drop, a fine-tuning …
External link:
http://arxiv.org/abs/2305.14760
Recently, Large Language Models (LLMs) have been serving as general-purpose interfaces, posing a significant demand for comprehensive visual knowledge. However, it remains unclear how well current LLMs and their visually augmented counterparts (VaLMs…
External link:
http://arxiv.org/abs/2305.15028
Continual relation extraction (CRE) models aim at handling emerging new relations while avoiding catastrophically forgetting old ones in the streaming data. Though improvements have been shown by previous CRE studies, most of them only adopt a vanill…
External link:
http://arxiv.org/abs/2305.04636
Authors:
Dong, Qingxiu, Li, Lei, Dai, Damai, Zheng, Ce, Ma, Jingyuan, Li, Rui, Xia, Heming, Xu, Jingjing, Wu, Zhiyong, Liu, Tianyu, Chang, Baobao, Sun, Xu, Sui, Zhifang
With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It has been …
External link:
http://arxiv.org/abs/2301.00234
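The ICL setup this survey entry describes, prediction conditioned on a context augmented with a few demonstrations and no weight updates, can be illustrated with a minimal prompt builder. The review/sentiment template and demonstrations below are invented examples, not taken from the survey:

```python
# A few input-output demonstrations (invented) plus the query are
# concatenated into one prompt; the model is never fine-tuned.
demos = [
    ("great movie, loved it", "positive"),
    ("what a waste of time", "negative"),
]
query = "the plot was thin but the acting was superb"

prompt = "".join(f"Review: {x}\nSentiment: {y}\n\n" for x, y in demos)
prompt += f"Review: {query}\nSentiment:"
print(prompt)
```

The trailing "Sentiment:" cue asks the model to continue the established pattern, which is what makes the demonstrations act as implicit task instructions.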
We study lossless acceleration for seq2seq generation with a novel decoding algorithm -- Aggressive Decoding. Unlike the previous efforts (e.g., non-autoregressive decoding) speeding up seq2seq generation at the cost of quality loss, our approach aim…
External link:
http://arxiv.org/abs/2205.10350