Showing 1 - 10 of 58 results for search: '"Jiang, Huiqiang"'
Author:
Jiang, Huiqiang, Li, Yucheng, Zhang, Chengruidong, Wu, Qianhui, Luo, Xufang, Ahn, Surin, Han, Zhenhua, Abdi, Amir H., Li, Dongsheng, Lin, Chin-Yew, Yang, Yuqing, Qiu, Lili
The computational challenges of Large Language Model (LLM) inference remain a significant barrier to their widespread deployment, especially as prompt lengths continue to increase. Due to the quadratic complexity of the attention computation, it takes …
External link:
http://arxiv.org/abs/2407.02490
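The quadratic attention cost mentioned in the abstract above can be illustrated with a rough FLOP count (an illustrative back-of-the-envelope estimate, not the paper's own analysis):

```python
def attention_flops(n, d):
    """Rough multiply-add count for dense self-attention over n tokens
    with head dimension d: the Q @ K^T and A @ V products each cost
    about n*n*d operations, plus ~n*n for the softmax.
    Every term scales with n**2, hence the quadratic bottleneck."""
    return 2 * n * n * d + n * n

# Doubling the prompt length quadruples the attention cost:
print(attention_flops(2000, 64) / attention_flops(1000, 64))  # 4.0
```

This is why long-context inference work (such as the paper above) targets the attention computation specifically rather than the feed-forward layers, whose cost grows only linearly in n.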
Author:
Yu, Yijiong, Jiang, Huiqiang, Luo, Xufang, Wu, Qianhui, Lin, Chin-Yew, Li, Dongsheng, Yang, Yuqing, Huang, Yongfeng, Qiu, Lili
Large Language Models (LLMs) are increasingly applied in various real-world scenarios due to their excellent generalization capabilities and robust generative abilities. However, they exhibit position bias, also known as "lost in the middle", a phenomenon …
External link:
http://arxiv.org/abs/2406.02536
The performance of large language models (LLMs) is significantly influenced by the quality of the prompts provided. In response, researchers have developed numerous prompt engineering strategies aimed at modifying the prompt text to enhance task performance …
External link:
http://arxiv.org/abs/2404.11216
Author:
Pan, Zhuoshi, Wu, Qianhui, Jiang, Huiqiang, Xia, Menglin, Luo, Xufang, Zhang, Jue, Lin, Qingwei, Rühle, Victor, Yang, Yuqing, Lin, Chin-Yew, Zhao, H. Vicky, Qiu, Lili, Zhang, Dongmei
This paper focuses on task-agnostic prompt compression for better generalizability and efficiency. Considering the redundancy in natural language, existing approaches compress prompts by removing tokens or lexical units according to their information …
External link:
http://arxiv.org/abs/2403.12968
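The token-removal style of prompt compression described above can be sketched with a toy example. This is a crude frequency-based stand-in, not the paper's method (which trains a compressor via data distillation): rarer tokens are treated as more informative and kept first.

```python
from collections import Counter
import math

def compress(tokens, keep_ratio=0.5):
    """Toy token-pruning compressor (illustrative only).

    Real methods estimate token information with a small language
    model; here we use unigram self-information -log p(token) as a
    crude surrogate, then keep the most informative tokens while
    preserving their original order."""
    counts = Counter(tokens)
    total = len(tokens)
    info = {t: -math.log(c / total) for t, c in counts.items()}
    k = max(1, int(len(tokens) * keep_ratio))
    # indices of the k highest-information tokens (stable sort keeps
    # original order among ties)
    keep = sorted(range(len(tokens)), key=lambda i: -info[tokens[i]])[:k]
    return [tokens[i] for i in sorted(keep)]

prompt = "the model reads the long prompt and the model answers".split()
print(compress(prompt, 0.4))  # ['reads', 'long', 'prompt', 'and']
```

The repeated filler words ("the", "model") are dropped first, which mirrors the intuition that natural-language redundancy can be removed without losing the task-relevant content.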
Author:
Jiang, Huiqiang, Wu, Qianhui, Luo, Xufang, Li, Dongsheng, Lin, Chin-Yew, Yang, Yuqing, Qiu, Lili
In long context scenarios, large language models (LLMs) face three main challenges: higher computational cost, performance reduction, and position bias. Research indicates that LLM performance hinges on the density and position of key information in …
External link:
http://arxiv.org/abs/2310.06839
Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs are becoming …
External link:
http://arxiv.org/abs/2310.05736
Author:
Liang, Yukang, Song, Kaitao, Mao, Shaoguang, Jiang, Huiqiang, Qiu, Luna, Yang, Yuqing, Li, Dongsheng, Xu, Linli, Qiu, Lili
Pronunciation assessment is a major challenge in computer-aided pronunciation training systems, especially at the word (phoneme) level. To obtain word (phoneme)-level scores, current methods usually rely on aligning components to obtain acoustic f…
External link:
http://arxiv.org/abs/2306.02682
Author:
Jiang, Huiqiang, Zhang, Li Lyna, Li, Yuang, Wu, Yu, Cao, Shijie, Cao, Ting, Yang, Yuqing, Li, Jinyu, Yang, Mao, Qiu, Lili
Automatic Speech Recognition (ASR) has seen remarkable advancements with deep neural networks, such as Transformer and Conformer. However, these models typically have large model sizes and high inference costs, posing a challenge to deploy on resource…
External link:
http://arxiv.org/abs/2305.19549
Cross-lingual named entity recognition (NER) aims to train an NER system that generalizes well to a target language by leveraging labeled data in a given source language. Previous work alleviates the data scarcity problem by translating source-language …
External link:
http://arxiv.org/abs/2305.14913
Author:
Tang, Chen, Zhang, Li Lyna, Jiang, Huiqiang, Xu, Jiahang, Cao, Ting, Zhang, Quanlu, Yang, Yuqing, Wang, Zhi, Yang, Mao
Neural Architecture Search (NAS) has shown promising performance in the automatic design of vision transformers (ViT) exceeding 1G FLOPs. However, designing lightweight and low-latency ViT models for diverse mobile devices remains a big challenge. In …
External link:
http://arxiv.org/abs/2303.09730