Showing 1-10 of 132 for search: '"Molchanov, Pavlo A."'
Authors:
Zhang, Hanning, Wang, Pengcheng, Diao, Shizhe, Lin, Yong, Pan, Rui, Dong, Hanze, Zhang, Dylan, Molchanov, Pavlo, Zhang, Tong
Large language models (LLMs) have shown promise in performing complex multi-step reasoning, yet they continue to struggle with mathematical reasoning, often making systematic errors. A promising solution is reinforcement learning (RL) guided by rewar
External link:
http://arxiv.org/abs/2412.11006
Authors:
Heinrich, Greg, Ranzinger, Mike, Yin, Hongxu, Lu, Yao, Kautz, Jan, Tao, Andrew, Catanzaro, Bryan, Molchanov, Pavlo
Agglomerative models have recently emerged as a powerful approach to training vision foundation models, leveraging multi-teacher distillation from existing models such as CLIP, DINO, and SAM. This strategy enables the efficient creation of robust mod
External link:
http://arxiv.org/abs/2412.07679
Authors:
Liu, Zhijian, Zhu, Ligeng, Shi, Baifeng, Zhang, Zhuoyang, Lou, Yuming, Yang, Shang, Xi, Haocheng, Cao, Shiyi, Gu, Yuxian, Li, Dacheng, Li, Xiuyu, Fang, Yunhao, Chen, Yukang, Hsieh, Cheng-Yu, Huang, De-An, Cheng, An-Chieh, Nath, Vishwesh, Hu, Jinyi, Liu, Sifei, Krishna, Ranjay, Xu, Daguang, Wang, Xiaolong, Molchanov, Pavlo, Kautz, Jan, Yin, Hongxu, Han, Song, Lu, Yao
Visual language models (VLMs) have made significant advances in accuracy in recent years. However, their efficiency has received much less attention. This paper introduces NVILA, a family of open VLMs designed to optimize both efficiency and accuracy
External link:
http://arxiv.org/abs/2412.04468
Authors:
Bercovich, Akhiad, Ronen, Tomer, Abramovich, Talor, Ailon, Nir, Assaf, Nave, Dabbah, Mohammad, Galil, Ido, Geifman, Amnon, Geifman, Yonatan, Golan, Izhak, Haber, Netanel, Karpas, Ehud, Koren, Roi, Levy, Itay, Molchanov, Pavlo, Mor, Shahar, Moshe, Zach, Nabwani, Najeeb, Puny, Omri, Rubin, Ran, Schen, Itamar, Shahaf, Ido, Tropp, Oren, Argov, Omer Ullman, Zilberstein, Ran, El-Yaniv, Ran
Large language models (LLMs) have demonstrated remarkable capabilities, but their adoption is limited by high computational costs during inference. While increasing parameter counts enhances accuracy, it also widens the gap between state-of-the-art c
External link:
http://arxiv.org/abs/2411.19146
Authors:
Dong, Xin, Fu, Yonggan, Diao, Shizhe, Byeon, Wonmin, Chen, Zijia, Mahabaleshwarkar, Ameya Sunil, Liu, Shih-Yang, Van Keirsbilck, Matthijs, Chen, Min-Hung, Suhara, Yoshi, Lin, Yingyan, Kautz, Jan, Molchanov, Pavlo
We propose Hymba, a family of small language models featuring a hybrid-head parallel architecture that integrates transformer attention mechanisms with state space models (SSMs) for enhanced efficiency. Attention heads provide high-resolution recall,
External link:
http://arxiv.org/abs/2411.13676
Authors:
Nath, Vishwesh, Li, Wenqi, Yang, Dong, Myronenko, Andriy, Zheng, Mingxin, Lu, Yao, Liu, Zhijian, Yin, Hongxu, Law, Yee Man, Tang, Yucheng, Guo, Pengfei, Zhao, Can, Xu, Ziyue, He, Yufan, Heinrich, Greg, Aylward, Stephen, Edgar, Marc, Zephyr, Michael, Molchanov, Pavlo, Turkbey, Baris, Roth, Holger, Xu, Daguang
Generalist vision language models (VLMs) have made significant strides in computer vision, but they fall short in specialized fields like healthcare, where expert knowledge is essential. In traditional computer vision tasks, creative or approximate a
External link:
http://arxiv.org/abs/2411.12915
Authors:
Liu, Shih-Yang, Yang, Huck, Wang, Chien-Yi, Fung, Nai Chit, Yin, Hongxu, Sakr, Charbel, Muralidharan, Saurav, Cheng, Kwang-Ting, Kautz, Jan, Wang, Yu-Chiang Frank, Molchanov, Pavlo, Chen, Min-Hung
In this work, we re-formulate the model compression problem into the customized compensation problem: Given a compressed model, we aim to introduce residual low-rank paths to compensate for compression errors under customized requirements from users
External link:
http://arxiv.org/abs/2410.21271
Authors:
Ranzinger, Mike, Barker, Jon, Heinrich, Greg, Molchanov, Pavlo, Catanzaro, Bryan, Tao, Andrew
Various visual foundation models have distinct strengths and weaknesses, both of which can be improved through heterogeneous multi-teacher knowledge distillation without labels, termed "agglomerative models." We build upon this body of work by studyi
External link:
http://arxiv.org/abs/2410.01680
Authors:
Fang, Gongfan, Yin, Hongxu, Muralidharan, Saurav, Heinrich, Greg, Pool, Jeff, Kautz, Jan, Molchanov, Pavlo, Wang, Xinchao
Large Language Models (LLMs) are distinguished by their massive parameter counts, which typically result in significant redundancy. This work introduces MaskLLM, a learnable pruning method that establishes Semi-structured (or "N:M") Sparsity in LLM
External link:
http://arxiv.org/abs/2409.17481
Authors:
Li, Jiefeng, Yuan, Ye, Rempe, Davis, Zhang, Haotian, Molchanov, Pavlo, Lu, Cewu, Kautz, Jan, Iqbal, Umar
Estimating global human motion from moving cameras is challenging due to the entanglement of human and camera motions. To mitigate the ambiguity, existing methods leverage learned human motion priors, which however often result in oversmoothed motion
External link:
http://arxiv.org/abs/2408.16426