Showing 1 - 10 of 2,237
for the search: '"A P, Molchanov"'
Author:
Heinrich, Greg, Ranzinger, Mike, Yin, Hongxu, Lu, Yao, Kautz, Jan, Tao, Andrew, Catanzaro, Bryan, Molchanov, Pavlo
Agglomerative models have recently emerged as a powerful approach to training vision foundation models, leveraging multi-teacher distillation from existing models such as CLIP, DINO, and SAM. This strategy enables the efficient creation of robust models… (a toy sketch of multi-teacher distillation follows this record)
External link:
http://arxiv.org/abs/2412.07679
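The snippet above describes distilling several frozen teachers into one student backbone without labels. Below is a minimal sketch of that idea, assuming per-teacher projection heads and a plain MSE feature-matching loss; all dimensions, module names, and the loss choice are illustrative assumptions, not the paper's recipe.

```python
# Hedged sketch of label-free multi-teacher feature distillation.
import torch
import torch.nn as nn

d_student = 256
d_teachers = {"clip": 512, "dino": 384, "sam": 256}  # assumed feature widths

# Toy student backbone; a real agglomerative model would be a ViT.
student = nn.Sequential(
    nn.Linear(768, d_student), nn.GELU(), nn.Linear(d_student, d_student)
)
# One projection head per teacher, mapping student features to each teacher's space.
heads = nn.ModuleDict({k: nn.Linear(d_student, d) for k, d in d_teachers.items()})

def distill_loss(x: torch.Tensor, teacher_feats: dict) -> torch.Tensor:
    # teacher_feats[k]: features produced by frozen teacher k on the same batch.
    z = student(x)
    return sum(nn.functional.mse_loss(heads[k](z), f)
               for k, f in teacher_feats.items())

x = torch.randn(4, 768)  # stand-in for image embeddings
teacher_feats = {k: torch.randn(4, d) for k, d in d_teachers.items()}
loss = distill_loss(x, teacher_feats)
loss.backward()
```

In practice such models also balance the per-teacher losses and match spatial feature maps rather than pooled vectors, but the pattern above is the core of the setup.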
Author:
Liu, Zhijian, Zhu, Ligeng, Shi, Baifeng, Zhang, Zhuoyang, Lou, Yuming, Yang, Shang, Xi, Haocheng, Cao, Shiyi, Gu, Yuxian, Li, Dacheng, Li, Xiuyu, Fang, Yunhao, Chen, Yukang, Hsieh, Cheng-Yu, Huang, De-An, Cheng, An-Chieh, Nath, Vishwesh, Hu, Jinyi, Liu, Sifei, Krishna, Ranjay, Xu, Daguang, Wang, Xiaolong, Molchanov, Pavlo, Kautz, Jan, Yin, Hongxu, Han, Song, Lu, Yao
Visual language models (VLMs) have made significant advances in accuracy in recent years. However, their efficiency has received much less attention. This paper introduces NVILA, a family of open VLMs designed to optimize both efficiency and accuracy…
External link:
http://arxiv.org/abs/2412.04468
Author:
Bercovich, Akhiad, Ronen, Tomer, Abramovich, Talor, Ailon, Nir, Assaf, Nave, Dabbah, Mohammad, Galil, Ido, Geifman, Amnon, Geifman, Yonatan, Golan, Izhak, Haber, Netanel, Karpas, Ehud, Koren, Roi, Levy, Itay, Molchanov, Pavlo, Mor, Shahar, Moshe, Zach, Nabwani, Najeeb, Puny, Omri, Rubin, Ran, Schen, Itamar, Shahaf, Ido, Tropp, Oren, Argov, Omer Ullman, Zilberstein, Ran, El-Yaniv, Ran
Large language models (LLMs) have demonstrated remarkable capabilities, but their adoption is limited by high computational costs during inference. While increasing parameter counts enhances accuracy, it also widens the gap between state-of-the-art capabilities…
External link:
http://arxiv.org/abs/2411.19146
Author:
Dong, Xin, Fu, Yonggan, Diao, Shizhe, Byeon, Wonmin, Chen, Zijia, Mahabaleshwarkar, Ameya Sunil, Liu, Shih-Yang, Van Keirsbilck, Matthijs, Chen, Min-Hung, Suhara, Yoshi, Lin, Yingyan, Kautz, Jan, Molchanov, Pavlo
We propose Hymba, a family of small language models featuring a hybrid-head parallel architecture that integrates transformer attention mechanisms with state space models (SSMs) for enhanced efficiency. Attention heads provide high-resolution recall, while SSM heads enable efficient context summarization… (a toy sketch of the hybrid layout follows this record)
External link:
http://arxiv.org/abs/2411.13676
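The Hymba snippet describes attention and SSM heads operating in parallel within a layer. Below is a toy sketch of that layout only, assuming a crude diagonal linear SSM and an even averaging of the two branches; the actual Hymba heads are Mamba-style SSMs and the paper's design is considerably more involved.

```python
# Toy sketch of a hybrid attention + SSM block (not Hymba's real architecture).
import torch
import torch.nn as nn

class HybridHeadBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Per-channel diagonal SSM parameters (a crude stand-in for Mamba-style heads).
        self.A = nn.Parameter(torch.full((d_model,), -0.5))  # decay per channel
        self.B = nn.Parameter(torch.ones(d_model))
        self.C = nn.Parameter(torch.ones(d_model))

    def ssm(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); scan h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
        a = torch.sigmoid(self.A)            # keep the recurrence stable in (0, 1)
        h = torch.zeros_like(x[:, 0])
        ys = []
        for t in range(x.size(1)):
            h = a * h + self.B * x[:, t]
            ys.append(self.C * h)
        return torch.stack(ys, dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)     # high-resolution recall over the sequence
        ssm_out = self.ssm(x)                # cheap running summary of the context
        return x + 0.5 * (attn_out + ssm_out)

x = torch.randn(2, 10, 64)
print(HybridHeadBlock(64, 4)(x).shape)       # torch.Size([2, 10, 64])
```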
Author:
Nath, Vishwesh, Li, Wenqi, Yang, Dong, Myronenko, Andriy, Zheng, Mingxin, Lu, Yao, Liu, Zhijian, Yin, Hongxu, Law, Yee Man, Tang, Yucheng, Guo, Pengfei, Zhao, Can, Xu, Ziyue, He, Yufan, Heinrich, Greg, Aylward, Stephen, Edgar, Marc, Zephyr, Michael, Molchanov, Pavlo, Turkbey, Baris, Roth, Holger, Xu, Daguang
Generalist vision language models (VLMs) have made significant strides in computer vision, but they fall short in specialized fields like healthcare, where expert knowledge is essential. In traditional computer vision tasks, creative or approximate answers…
External link:
http://arxiv.org/abs/2411.12915
Author:
Liu, Shih-Yang, Yang, Huck, Wang, Chien-Yi, Fung, Nai Chit, Yin, Hongxu, Sakr, Charbel, Muralidharan, Saurav, Cheng, Kwang-Ting, Kautz, Jan, Wang, Yu-Chiang Frank, Molchanov, Pavlo, Chen, Min-Hung
In this work, we re-formulate the model compression problem into the customized compensation problem: given a compressed model, we aim to introduce residual low-rank paths to compensate for compression errors under customized requirements from users… (a toy sketch of the residual path follows this record)
External link:
http://arxiv.org/abs/2410.21271
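To make the compensation idea concrete, here is a hedged sketch: the error of a compressed weight matrix is approximated by a rank-r factorization and added back as a residual path. Plain truncated SVD is used here as a stand-in; EoRA itself builds the low-rank path in an eigenspace tied to calibration activations, which this sketch does not attempt.

```python
# Hedged sketch of residual low-rank compensation for compression error.
import torch

def low_rank_compensation(W: torch.Tensor, W_compressed: torch.Tensor, r: int):
    # Best rank-r approximation (in Frobenius norm) of the compression error.
    E = W - W_compressed
    U, S, Vh = torch.linalg.svd(E, full_matrices=False)
    B = U[:, :r] * S[:r]          # (out, r)
    A = Vh[:r, :]                 # (r, in)
    return B, A

W = torch.randn(256, 256)
W_c = torch.round(W * 4) / 4      # toy "compression": coarse quantization
B, A = low_rank_compensation(W, W_c, r=16)
err_before = (W - W_c).norm()
err_after = (W - (W_c + B @ A)).norm()
print(err_before.item(), err_after.item())  # the residual path shrinks the error
```

At inference the model then uses W_c plus the cheap B @ A path, so the compensation cost scales with the chosen rank r rather than with the full weight matrix.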
Patients with schizophrenia often present with cognitive impairments that may hinder their ability to learn about their condition. These individuals could benefit greatly from education platforms that leverage the adaptability of Large Language Models…
External link:
http://arxiv.org/abs/2410.12848
Author:
Margarint, Vlad, Molchanov, Stanislav
The first step in the formulation and study of the Riemann Hypothesis is the analytic continuation of the Riemann Zeta Function (RZF) to the full complex plane with a pole at $s=1$. In the current work, we study the analytic continuation of two random… (the classical continuation step is recalled after this record)
External link:
http://arxiv.org/abs/2410.03044
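As background for the snippet above, the classical first continuation step it alludes to can be written down directly: the alternating (Dirichlet eta) series converges for Re(s) > 0, and the prefactor exhibits the simple pole at s = 1. This is standard material, not the paper's randomized construction.

```latex
% Analytic continuation of zeta to Re(s) > 0 via the Dirichlet eta series;
% the simple pole at s = 1 comes from the vanishing prefactor denominator.
\zeta(s) = \frac{1}{1 - 2^{1-s}} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^{s}},
\qquad \operatorname{Re}(s) > 0, \ s \neq 1.
```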
Author:
Ranzinger, Mike, Barker, Jon, Heinrich, Greg, Molchanov, Pavlo, Catanzaro, Bryan, Tao, Andrew
Various visual foundation models have distinct strengths and weaknesses, both of which can be improved through heterogeneous multi-teacher knowledge distillation without labels, termed "agglomerative models." We build upon this body of work by studying…
External link:
http://arxiv.org/abs/2410.01680
Author:
Fang, Gongfan, Yin, Hongxu, Muralidharan, Saurav, Heinrich, Greg, Pool, Jeff, Kautz, Jan, Molchanov, Pavlo, Wang, Xinchao
Large Language Models (LLMs) are distinguished by their massive parameter counts, which typically result in significant redundancy. This work introduces MaskLLM, a learnable pruning method that establishes Semi-structured (or "N:M") Sparsity in LLMs… (a toy 2:4 mask follows this record)
External link:
http://arxiv.org/abs/2409.17481
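The MaskLLM snippet names "N:M" sparsity: in every contiguous group of M weights, at most N may be nonzero. The sketch below builds a 2:4 mask by weight magnitude, purely to illustrate the constraint; per the abstract, MaskLLM's contribution is to learn these masks end-to-end rather than pick them by magnitude.

```python
# Hedged illustration of the 2:4 sparsity pattern (not MaskLLM's learned masks).
import torch

def two_four_mask(W: torch.Tensor) -> torch.Tensor:
    # W: (out, in) with `in` divisible by 4.
    groups = W.abs().reshape(W.shape[0], -1, 4)          # (out, in//4, 4)
    idx = groups.topk(2, dim=-1).indices                 # keep the 2 largest per group
    mask = torch.zeros_like(groups).scatter_(-1, idx, 1.0)
    return mask.reshape_as(W)

W = torch.randn(8, 16)
mask = two_four_mask(W)
assert mask.reshape(8, -1, 4).sum(-1).eq(2).all()        # exactly 2 of every 4 kept
W_sparse = W * mask
```

The regular 2-of-4 layout is what lets sparse tensor hardware skip the zeroed weights at inference, which is the computational payoff the abstract refers to.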