Showing 1 - 6 of 6
for the search: '"Heinrich, Greg"'
Author:
Siddiqui, Shoaib Ahmed, Dong, Xin, Heinrich, Greg, Breuel, Thomas, Kautz, Jan, Krueger, David, Molchanov, Pavlo
Large Language Models (LLMs) are not only resource-intensive to train but even more costly to deploy in production. Therefore, recent work has attempted to prune blocks of LLMs based on cheap proxies for estimating block importance, effectively removing …
External link:
http://arxiv.org/abs/2407.16286
Author:
Cai, Ruisi, Muralidharan, Saurav, Heinrich, Greg, Yin, Hongxu, Wang, Zhangyang, Kautz, Jan, Molchanov, Pavlo
Training modern LLMs is extremely resource-intensive, and customizing them for various deployment scenarios characterized by limited compute and memory resources through repeated training is impractical. In this paper, we introduce Flextron, a network …
External link:
http://arxiv.org/abs/2406.10260
Published in:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 12490-12500
A handful of visual foundation models (VFMs) have recently emerged as the backbones for numerous downstream tasks. VFMs like CLIP, DINOv2, and SAM are trained with distinct objectives, exhibiting unique characteristics for various downstream tasks. We find …
External link:
http://arxiv.org/abs/2312.06709
Author:
Hatamizadeh, Ali, Heinrich, Greg, Yin, Hongxu, Tao, Andrew, Alvarez, Jose M., Kautz, Jan, Molchanov, Pavlo
We design a new family of hybrid CNN-ViT neural networks, named FasterViT, with a focus on high image throughput for computer vision (CV) applications. FasterViT combines the benefits of fast local representation learning in CNNs and global modeling …
External link:
http://arxiv.org/abs/2306.06189
We propose the global context vision transformer (GC ViT), a novel architecture that enhances parameter and compute utilization for computer vision. Our method leverages global context self-attention modules, joint with standard local self-attention, to …
External link:
http://arxiv.org/abs/2206.09959
Author:
Heinrich, Greg, Frosio, Iuri
Training intelligent agents through reinforcement learning is a notoriously unstable procedure. Massive parallelization on GPUs and distributed systems has been exploited to generate a large amount of training experiences and consequently reduce instability …
External link:
http://arxiv.org/abs/1902.02725