Výsledky vyhledávání

Report

Scaling Parameter-Constrained Language Models with Quality Data

Autor: Chang, Ernie, Paltenghi, Matteo, Li, Yang, Lin, Pin-Jie, Zhao, Changsheng, Huber, Patrick, Liu, Zechun, Rabatin, Rastislav, Shi, Yangyang, Chandra, Vikas

Scaling laws in language modeling traditionally quantify training loss as a function of dataset size and model parameters, providing compute-optimal estimates but often neglecting the impact of data quality on model generalization. In this paper, we

Externí odkaz: http://arxiv.org/abs/2410.03083

Zobrazit plný text záznamu

Report

Target-Aware Language Modeling via Granular Data Sampling

Autor: Chang, Ernie, Lin, Pin-Jie, Li, Yang, Zhao, Changsheng, Kim, Daeil, Rabatin, Rastislav, Liu, Zechun, Shi, Yangyang, Chandra, Vikas

Language model pretraining generally targets a broad range of use cases and incorporates data from diverse sources. However, there are instances where we desire a model that excels in specific areas without markedly compromising performance in other

Externí odkaz: http://arxiv.org/abs/2409.14705

Zobrazit plný text záznamu

Report

RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization

Autor: Huang, Xijie, Liu, Zechun, Liu, Shih-Yang, Cheng, Kwang-Ting

Low-Rank Adaptation (LoRA), as a representative Parameter-Efficient Fine-Tuning (PEFT)method, significantly enhances the training efficiency by updating only a small portion of the weights in Large Language Models (LLMs). Recently, weight-only quanti

Externí odkaz: http://arxiv.org/abs/2407.08044

Zobrazit plný text záznamu

Report

An Introduction to Vision-Language Modeling

Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce

Externí odkaz: http://arxiv.org/abs/2405.17247

Zobrazit plný text záznamu

Report

SpinQuant: LLM quantization with learned rotations

Autor: Liu, Zechun, Zhao, Changsheng, Fedorov, Igor, Soran, Bilge, Choudhary, Dhruv, Krishnamoorthi, Raghuraman, Chandra, Vikas, Tian, Yuandong, Blankevoort, Tijmen

Post-training quantization (PTQ) techniques applied to weights, activations, and the KV cache greatly reduce memory usage, latency, and power consumption of Large Language Models (LLMs), but may lead to large quantization errors when outliers are pre

Externí odkaz: http://arxiv.org/abs/2405.16406

Zobrazit plný text záznamu

Report

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Autor: Liu, Zechun, Zhao, Changsheng, Iandola, Forrest, Lai, Chen, Tian, Yuandong, Fedorov, Igor, Xiong, Yunyang, Chang, Ernie, Shi, Yangyang, Krishnamoorthi, Raghuraman, Lai, Liangzhen, Chandra, Vikas

This paper addresses the growing need for efficient large language models (LLMs) on mobile devices, driven by increasing cloud costs and latency concerns. We focus on designing top-quality LLMs with fewer than a billion parameters, a practical choice

Externí odkaz: http://arxiv.org/abs/2402.14905

Zobrazit plný text záznamu

Report

On The Open Prompt Challenge In Conditional Audio Generation

Autor: Chang, Ernie, Srinivasan, Sidd, Luthra, Mahi, Lin, Pin-Jie, Nagaraja, Varun, Iandola, Forrest, Liu, Zechun, Ni, Zhaoheng, Zhao, Changsheng, Shi, Yangyang, Chandra, Vikas

Text-to-audio generation (TTA) produces audio from a text description, learning from pairs of audio samples and hand-annotated text. However, commercializing audio generation is challenging as user-input prompts are often under-specified when compare

Externí odkaz: http://arxiv.org/abs/2311.00897

Zobrazit plný text záznamu

Report

LLM-FP4: 4-Bit Floating-Point Quantized Transformers

Autor: Liu, Shih-yang, Liu, Zechun, Huang, Xijie, Dong, Pingcheng, Cheng, Kwang-Ting

Publikováno v: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We propose LLM-FP4 for quantizing both weights and activations in large language models (LLMs) down to 4-bit floating-point values, in a post-training manner. Existing post-training quantization (PTQ) solutions are primarily integer-based and struggl

Externí odkaz: http://arxiv.org/abs/2310.16836

Zobrazit plný text záznamu

Report

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Autor: Chen, Jun, Zhu, Deyao, Shen, Xiaoqian, Li, Xiang, Liu, Zechun, Zhang, Pengchuan, Krishnamoorthi, Raghuraman, Chandra, Vikas, Xiong, Yunyang, Elhoseiny, Mohamed

Large language models have shown their remarkable capabilities as a general interface for various language-related applications. Motivated by this, we target to build a unified interface for completing many vision-language tasks including image descr

Externí odkaz: http://arxiv.org/abs/2310.09478

Zobrazit plný text záznamu

Report

Efficient and Robust Quantization-aware Training via Adaptive Coreset Selection

Autor: Huang, Xijie, Liu, Zechun, Liu, Shih-Yang, Cheng, Kwang-Ting

Quantization-aware training (QAT) is a representative model compression method to reduce redundancy in weights and activations. However, most existing QAT methods require end-to-end training on the entire dataset, which suffers from long training tim

Externí odkaz: http://arxiv.org/abs/2306.07215

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání