Showing 1 - 10 of 127 for search: "Molchanov, Pavlo"
Author:
Ranzinger, Mike, Barker, Jon, Heinrich, Greg, Molchanov, Pavlo, Catanzaro, Bryan, Tao, Andrew
Various visual foundation models have distinct strengths and weaknesses, both of which can be improved through heterogeneous multi-teacher knowledge distillation without labels, termed "agglomerative models." We build upon this body of work by studying…
External link:
http://arxiv.org/abs/2410.01680
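As a rough illustration of the setup this line of work builds on, the sketch below distills a student against several teachers at the feature level, without labels. All names are placeholders, and the per-teacher standardization is just one simple way to balance heterogeneous teacher statistics, not the paper's exact formulation:

    import torch
    import torch.nn.functional as F

    def multi_teacher_distill_loss(student_feats, teacher_feats, adaptors):
        # Label-free agglomerative distillation: a per-teacher adaptor head maps
        # student features into each teacher's space, and both sides are
        # standardized so no single teacher's activation scale dominates.
        total = 0.0
        for t, head in zip(teacher_feats, adaptors):
            p = head(student_feats)
            t = (t - t.mean()) / (t.std() + 1e-6)
            p = (p - p.mean()) / (p.std() + 1e-6)
            total = total + F.mse_loss(p, t)
        return total / len(adaptors)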
Author:
Fang, Gongfan, Yin, Hongxu, Muralidharan, Saurav, Heinrich, Greg, Pool, Jeff, Kautz, Jan, Molchanov, Pavlo, Wang, Xinchao
Large Language Models (LLMs) are distinguished by their massive parameter counts, which typically result in significant redundancy. This work introduces MaskLLM, a learnable pruning method that establishes Semi-structured (or "N:M") Sparsity in LLMs…
External link:
http://arxiv.org/abs/2409.17481
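For context, N:M sparsity keeps at most N nonzero weights in every group of M consecutive weights. MaskLLM learns these masks end to end; the sketch below shows only the simpler magnitude-based baseline, with all names illustrative:

    import torch

    def nm_magnitude_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
        # Keep the n largest-magnitude weights in each contiguous group of m
        # (the common "2:4" pattern by default). Assumes the number of input
        # features is divisible by m.
        rows, cols = weight.shape
        groups = weight.abs().reshape(-1, m)
        keep = groups.topk(n, dim=1).indices
        mask = torch.zeros_like(groups)
        mask.scatter_(1, keep, 1.0)
        return mask.reshape(rows, cols)

    w = torch.randn(8, 16)
    sparse_w = w * nm_magnitude_mask(w)  # exactly 2 nonzeros per group of 4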
Author:
Li, Jiefeng, Yuan, Ye, Rempe, Davis, Zhang, Haotian, Molchanov, Pavlo, Lu, Cewu, Kautz, Jan, Iqbal, Umar
Estimating global human motion from moving cameras is challenging due to the entanglement of human and camera motions. To mitigate the ambiguity, existing methods leverage learned human motion priors, which, however, often result in oversmoothed motions…
External link:
http://arxiv.org/abs/2408.16426
Author:
Sreenivas, Sharath Turuvekere, Muralidharan, Saurav, Joshi, Raviraj, Chochowski, Marcin, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan, Kautz, Jan, Molchanov, Pavlo
We present a comprehensive report on compressing the Llama 3.1 8B and Mistral NeMo 12B models to 4B and 8B parameters, respectively, using pruning and distillation. We explore two distinct pruning strategies: (1) depth pruning and (2) joint hidden/attention…
External link:
http://arxiv.org/abs/2408.11796
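As a rough illustration of the depth-pruning side, the skeleton below drops the least important transformer blocks given externally computed importance scores. How those scores are obtained, and the distillation-based retraining that follows, is the substance of the report and is not shown here:

    import torch.nn as nn

    def depth_prune(blocks: nn.ModuleList, importance: list, keep: int) -> nn.ModuleList:
        # Keep the highest-scoring transformer blocks, preserving their
        # original order so the residual stream stays coherent.
        ranked = sorted(range(len(blocks)), key=lambda i: importance[i], reverse=True)
        survivors = sorted(ranked[:keep])
        return nn.ModuleList(blocks[i] for i in survivors)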
Author:
Xue, Fuzhao, Chen, Yukang, Li, Dacheng, Hu, Qinghao, Zhu, Ligeng, Li, Xiuyu, Fang, Yunhao, Tang, Haotian, Yang, Shang, Liu, Zhijian, He, Ethan, Yin, Hongxu, Molchanov, Pavlo, Kautz, Jan, Fan, Linxi, Zhu, Yuke, Lu, Yao, Han, Song
Long-context capability is critical for multi-modal foundation models, especially for long video understanding. We introduce LongVILA, a full-stack solution for long-context visual-language models by co-designing the algorithm and system. For model training…
External link:
http://arxiv.org/abs/2408.10188
Author:
Fang, Yunhao, Zhu, Ligeng, Lu, Yao, Wang, Yan, Molchanov, Pavlo, Cho, Jang Hyun, Pavone, Marco, Han, Song, Yin, Hongxu
Visual language models (VLMs) have rapidly progressed, driven by the success of large language models (LLMs). While model architectures and training infrastructures advance rapidly, data curation remains under-explored. When data quantity and quality…
External link:
http://arxiv.org/abs/2407.17453
Author:
Siddiqui, Shoaib Ahmed, Dong, Xin, Heinrich, Greg, Breuel, Thomas, Kautz, Jan, Krueger, David, Molchanov, Pavlo
Large Language Models (LLMs) are not only resource-intensive to train but even more costly to deploy in production. Therefore, recent work has attempted to prune blocks of LLMs based on cheap proxies for estimating block importance, effectively removing…
External link:
http://arxiv.org/abs/2407.16286
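One family of cheap proxies scores a block by how little it changes its input: if a block's output is nearly identical to its input, removing it should be nearly free. The sketch below illustrates that idea with cosine similarity over hidden states; it is a generic stand-in, not necessarily the proxy this paper examines:

    import torch

    @torch.no_grad()
    def block_redundancy(hidden_in: torch.Tensor, hidden_out: torch.Tensor) -> float:
        # Cosine similarity between a block's input and output activations,
        # averaged over tokens. Values near 1.0 mark the block as redundant.
        x = hidden_in.flatten(0, -2)   # (tokens, hidden_dim)
        y = hidden_out.flatten(0, -2)
        cos = torch.nn.functional.cosine_similarity(x, y, dim=-1)
        return cos.mean().item()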
Author:
Muralidharan, Saurav, Sreenivas, Sharath Turuvekere, Joshi, Raviraj, Chochowski, Marcin, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan, Kautz, Jan, Molchanov, Pavlo
Large language models (LLMs) targeting different deployment scales and sizes are currently produced by training each variant from scratch; this is extremely compute-intensive. In this paper, we investigate if pruning an existing LLM and then re-training it…
External link:
http://arxiv.org/abs/2407.14679
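Complementing the depth-pruning skeleton above, width pruning removes individual neurons instead of whole blocks. A minimal sketch, assuming a small calibration batch of activations and using mean absolute activation as the importance score (the paper's actual criteria and retraining recipe differ):

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def prune_linear_outputs(layer: nn.Linear, acts: torch.Tensor, keep: int) -> nn.Linear:
        # Score each output neuron by its mean absolute activation on a
        # calibration batch of shape (batch, out_features), then rebuild
        # the layer with only the top-scoring neurons.
        scores = acts.abs().mean(dim=0)
        idx = scores.topk(keep).indices.sort().values   # survivors, original order
        pruned = nn.Linear(layer.in_features, keep, bias=layer.bias is not None)
        pruned.weight.copy_(layer.weight[idx])
        if layer.bias is not None:
            pruned.bias.copy_(layer.bias[idx])
        return pruned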
Author:
Cai, Ruisi, Muralidharan, Saurav, Heinrich, Greg, Yin, Hongxu, Wang, Zhangyang, Kautz, Jan, Molchanov, Pavlo
Training modern LLMs is extremely resource-intensive, and customizing them for various deployment scenarios characterized by limited compute and memory resources through repeated training is impractical. In this paper, we introduce Flextron, a network…
External link:
http://arxiv.org/abs/2406.10260
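The core idea behind such elastic networks is that one set of weights can serve several model widths by activating only a leading slice at inference time. A toy sketch of that nested-weight idea follows; Flextron additionally learns routers that pick sub-networks per input, which is omitted here:

    import torch
    import torch.nn as nn

    class ElasticLinear(nn.Module):
        # One full-size weight matrix serves several deployment widths by
        # slicing its leading output rows at inference time.
        def __init__(self, in_features: int, out_features: int):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

        def forward(self, x: torch.Tensor, active_out: int) -> torch.Tensor:
            return x @ self.weight[:active_out].t()

    layer = ElasticLinear(64, 64)
    x = torch.randn(2, 64)
    y_full  = layer(x, active_out=64)   # full-capacity path
    y_small = layer(x, active_out=32)   # reduced width for tight memory budgets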
Data often arrives in sequence over time in real-world deep learning applications such as autonomous driving. When new training data is available, training the model from scratch undermines the benefit of leveraging the learned knowledge, leading to…
External link:
http://arxiv.org/abs/2406.04484
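Since this abstract is cut off before naming the method, the sketch below only illustrates the general setup it motivates: warm-starting from the existing model and mixing incoming data with a small replay buffer, rather than retraining from scratch. Names such as train_step are placeholders:

    import random

    def continual_update(model, new_data, replay_buffer, train_step, replay_ratio=0.5):
        # Fine-tune the existing model on incoming samples, rehearsing old
        # samples from the buffer to reduce forgetting of prior knowledge.
        for sample in new_data:
            batch = [sample]
            if replay_buffer and random.random() < replay_ratio:
                batch.append(random.choice(replay_buffer))
            train_step(model, batch)
            replay_buffer.append(sample)
        return model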