Showing 1 - 10 of 593 for search: '"Luan, Jian An"'
Many positional encodings (PEs) are designed to exhibit long-term decay, based on an entrenched, long-standing inductive bias: tokens farther from the current position carry less relevant information. We argue that long-term decay is outdated…
External link:
http://arxiv.org/abs/2410.21216
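As a rough illustration of the long-term decay these PEs build in, here is a minimal ALiBi-style distance penalty on attention logits; this sketches the decay property the snippet questions, not the encoding proposed in the linked paper.

```python
import numpy as np

def alibi_style_bias(seq_len: int, slope: float = 0.1) -> np.ndarray:
    """Attention bias that penalizes keys by their distance from the
    query position, one common way PEs realize 'long-term decay'
    (ALiBi-style; illustrative, not the linked paper's encoding)."""
    q = np.arange(seq_len)[:, None]   # query positions
    k = np.arange(seq_len)[None, :]   # key positions
    return -slope * np.abs(q - k)     # farther away => more negative logit

bias = alibi_style_bias(6)
print(bias[5])  # bias of the last query toward each key: decays with distance
```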
Low-rank adaptation (LoRA) and its variants have recently attracted much interest due to their ability to avoid excessive inference costs. However, LoRA still faces the following challenges: (1) the limitation of the low-rank assumption; and (2) its initialization…
External link:
http://arxiv.org/abs/2409.16722
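For context, a minimal sketch of the low-rank update LoRA applies to a frozen weight, the same low-rank assumption the snippet flags as a limitation (standard LoRA form; hyperparameters are illustrative):

```python
import torch

class LoRALinear(torch.nn.Module):
    """Minimal LoRA sketch: frozen weight W plus a trainable
    low-rank update (alpha / r) * B @ A."""
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(d_out, d_in), requires_grad=False)
        self.A = torch.nn.Parameter(torch.randn(r, d_in) * 0.01)  # common init: small gaussian
        self.B = torch.nn.Parameter(torch.zeros(d_out, r))        # common init: zeros
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.weight + self.scale * (self.B @ self.A)).T

layer = LoRALinear(32, 32)
print(layer(torch.randn(2, 32)).shape)  # torch.Size([2, 32])
```

Because B @ A can be merged into the frozen weight after training, the adapted layer adds no extra inference cost, which is the property the snippet refers to.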
ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback
Recently, tool-augmented LLMs have gained increasing attention. Given an instruction, a tool-augmented LLM can interact with various external tools over multiple rounds and then provide a final answer. However, previous LLMs were trained on overly detailed instructions…
External link:
http://arxiv.org/abs/2409.14826
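A sketch of the multi-round tool-interaction loop described above; call_llm, the TOOLS registry, and the action format are stand-ins, not ToolPlanner's actual interface.

```python
# Toy registry of external tools the agent may call.
TOOLS = {"search": lambda q: f"results for {q!r}"}

def run_agent(instruction: str, call_llm, max_rounds: int = 5) -> str:
    history = [("user", instruction)]
    for _ in range(max_rounds):
        action = call_llm(history)             # {"tool": ..., "args": ...} or {"answer": ...}
        if "answer" in action:
            return action["answer"]            # final answer ends the loop
        observation = TOOLS[action["tool"]](action["args"])
        history.append(("tool", observation))  # feed the result back for the next round
    return "max rounds reached"

# Toy policy: one search round, then answer.
def toy_llm(history):
    return {"tool": "search", "args": "weather"} if len(history) == 1 else {"answer": "done"}

print(run_agent("what's the weather?", toy_llm))
```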
Author:
Wu, Qinzhuo, Xu, Weikai, Liu, Wei, Tan, Tao, Liu, Jianfeng, Li, Ang, Luan, Jian, Wang, Bin, Shang, Shuo
Recently, mobile AI agents based on VLMs have been gaining increasing attention. These works typically use a VLM as the foundation and fine-tune it on instruction-based mobile datasets. However, these VLMs are typically pre-trained on general-domain data…
External link:
http://arxiv.org/abs/2409.14818
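To make "instruction-based mobile datasets" concrete, here is a sketch of what one training record might contain; the field names are hypothetical, not the schema of the dataset used in the linked paper.

```python
# Hypothetical record pairing a UI state with the action to supervise.
record = {
    "screenshot": "screens/step_003.png",        # current UI state (image input to the VLM)
    "instruction": "Open settings and enable dark mode",
    "history": ["tap('Apps')", "tap('Settings')"],
    "target_action": "tap('Display')",           # supervision signal for fine-tuning
}
print(record["target_action"])
```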
The Sparsely-Activated Mixture-of-Experts (MoE) has gained increasing popularity for scaling up large language models (LLMs) without exploding computational costs. Despite its success, the current design faces a challenge where all experts have the same size…
External link:
http://arxiv.org/abs/2409.12210
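A minimal top-k MoE sketch showing the uniform-expert design the snippet calls into question; routing details and sizes are illustrative.

```python
import torch

class TopKMoE(torch.nn.Module):
    """Sparsely-activated MoE sketch: a router sends each token to its
    top-k experts. All experts share one hidden size here, which is
    exactly the uniformity the snippet identifies as a challenge."""
    def __init__(self, d: int, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.router = torch.nn.Linear(d, n_experts)
        self.experts = torch.nn.ModuleList(
            torch.nn.Sequential(torch.nn.Linear(d, 4 * d), torch.nn.ReLU(),
                                torch.nn.Linear(4 * d, d))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                        # x: (tokens, d)
        gates = self.router(x).softmax(-1)       # (tokens, n_experts)
        topv, topi = gates.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):               # route each token to its k experts
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e
                if mask.any():
                    out[mask] += topv[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(16)
print(moe(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```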
Recent advances in large vision-language models (VLMs) typically employ vision encoders based on the Vision Transformer (ViT) architecture. The division of images into patches by ViT results in a fragmented perception, thereby hindering visual understanding…
External link:
http://arxiv.org/abs/2408.16224
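The patch division in question, in a few lines: each resulting token corresponds to one isolated p x p tile of the image, which is the "fragmented perception" the snippet refers to.

```python
import torch

def patchify(images: torch.Tensor, p: int = 16) -> torch.Tensor:
    """Split an image into non-overlapping p x p patches and flatten
    each patch into one token, as ViT-style encoders do before embedding."""
    b, c, h, w = images.shape
    x = images.unfold(2, p, p).unfold(3, p, p)           # (b, c, h/p, w/p, p, p)
    x = x.permute(0, 2, 3, 1, 4, 5)                      # group the patch grid first
    return x.reshape(b, (h // p) * (w // p), c * p * p)  # (b, num_patches, patch_dim)

imgs = torch.randn(1, 3, 224, 224)
print(patchify(imgs).shape)  # torch.Size([1, 196, 768])
```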
Structured pruning fundamentally reduces the computational and memory overheads of large language models (LLMs) and offers a feasible solution for on-device LLM deployment. Structurally pruned models remain dense and high-precision, and are highly compatible with…
External link:
http://arxiv.org/abs/2407.05690
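A sketch of why structured pruning keeps models dense: dropping whole hidden channels simply yields smaller ordinary matrices, so no sparse kernels are needed. The importance criterion below is illustrative, not the linked paper's method.

```python
import torch

def prune_ffn_channels(w_in: torch.Tensor, w_out: torch.Tensor, keep: float = 0.5):
    """Drop whole hidden channels of an FFN pair by a simple L2-norm
    score; the result is a smaller *dense* pair of matrices."""
    scores = w_in.norm(dim=1) * w_out.norm(dim=0)    # importance per hidden channel
    n_keep = int(keep * scores.numel())
    idx = scores.topk(n_keep).indices.sort().values  # channels to keep, in order
    return w_in[idx, :], w_out[:, idx]

w_in, w_out = torch.randn(1024, 256), torch.randn(256, 1024)
a, b = prune_ffn_channels(w_in, w_out)
print(a.shape, b.shape)  # torch.Size([512, 256]) torch.Size([256, 512])
```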
Author:
Deng, Shihan, Xu, Weikai, Sun, Hongda, Liu, Wei, Tan, Tao, Liu, Jianfeng, Li, Ang, Luan, Jian, Wang, Bin, Yan, Rui, Shang, Shuo
With the remarkable advancement of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction. However, benchmarks for LLM-based mobile agents remain scarce. Benchmarking these agents…
External link:
http://arxiv.org/abs/2407.00993
Author:
Wang, Quandong, Yuan, Yuxuan, Yang, Xiaoyu, Zhang, Ruike, Zhao, Kang, Liu, Wei, Luan, Jian, Povey, Daniel, Wang, Bin
While Large Language Models (LLMs) have achieved remarkable success in various fields, the efficiency of training and inference remains a major challenge. To address this issue, we propose SUBLLM, short for Subsampling-Upsampling-Bypass Large Language Model…
External link:
http://arxiv.org/abs/2406.06571
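A shape-level sketch of the subsampling-upsampling-bypass idea the name spells out: middle blocks run on fewer tokens, results are scattered back to full length, and a bypass preserves the full sequence. Token selection here is a fixed stride; SUBLLM's own selection mechanism is not shown.

```python
import torch

def subsample(x: torch.Tensor, keep_ratio: float = 0.5):
    """Keep a fraction of token positions (here simply every other one)."""
    idx = torch.arange(0, x.size(1), int(1 / keep_ratio))
    return x[:, idx], idx

def upsample(x_short: torch.Tensor, idx: torch.Tensor, seq_len: int) -> torch.Tensor:
    """Scatter the processed tokens back to the full sequence length."""
    out = torch.zeros(x_short.size(0), seq_len, x_short.size(2))
    out[:, idx] = x_short
    return out

x = torch.randn(2, 8, 16)            # (batch, seq, dim)
h, idx = subsample(x)                # middle blocks run on fewer tokens
h = h * 2                            # stand-in for the subsampled transformer blocks
y = upsample(h, idx, x.size(1)) + x  # bypass: residual from the full sequence
print(y.shape)                       # torch.Size([2, 8, 16])
```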
Published in:
In Proceedings of LREC-COLING 2024, pages 16263-16273
Tool learning aims to extend the capabilities of large language models (LLMs) with external tools. A major challenge in tool learning is supporting a large number of tools, including unseen ones. To address this challenge, previous studies have…
External link:
http://arxiv.org/abs/2403.06551
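One common way prior work supports large and unseen tool sets is retrieval over tool descriptions: any tool with a description can be matched to an instruction, even if it never appeared in training. The sketch below uses a toy embed() stand-in and is not necessarily the linked paper's approach.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real text-embedding model: a unit vector
    seeded from the text so lookups are repeatable within a run."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

def retrieve_tools(instruction: str, tool_docs: dict, k: int = 2) -> list:
    """Rank tools by cosine similarity between instruction and description."""
    q = embed(instruction)
    scored = {name: float(embed(doc) @ q) for name, doc in tool_docs.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

tools = {"weather": "get the forecast for a city",
         "calc": "evaluate arithmetic expressions",
         "maps": "look up driving directions"}
print(retrieve_tools("what is 17 * 24?", tools))  # ranking is meaningless with the toy embed
```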