Showing 1 - 10 of 48 for search: '"TULI, SHIKHAR"'
Author:
Lin, Chi-Heng, Gao, Shangqian, Smith, James Seale, Patel, Abhishek, Tuli, Shikhar, Shen, Yilin, Jin, Hongxia, Hsu, Yen-Chang
Large Language Models (LLMs) have reshaped the landscape of artificial intelligence by demonstrating exceptional performance across various tasks. However, substantial computational requirements make their deployment challenging on devices with limited …
External link:
http://arxiv.org/abs/2408.09632
Traditional language models operate autoregressively, i.e., they predict one token at a time. Rapid explosion in model sizes has resulted in high inference times. In this work, we propose DynaMo, a suite of multi-token prediction language models that …
External link:
http://arxiv.org/abs/2405.00888
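The abstract above contrasts one-token-at-a-time decoding with multi-token prediction. A minimal sketch of the two decoding loops, with a random-logits stand-in for the model (the `toy_logits` function and the greedy acceptance rule are illustrative assumptions, not DynaMo's actual heads or decoding strategy):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 50  # toy vocabulary size

def toy_logits(context, n_heads=1):
    # Stand-in for a model forward pass: logits for the next
    # n_heads tokens given the context (hypothetical model).
    return rng.standard_normal((n_heads, VOCAB))

def autoregressive_decode(prompt, steps=8):
    # Standard decoding: one forward pass per generated token.
    tokens = list(prompt)
    for _ in range(steps):
        logits = toy_logits(tokens, n_heads=1)
        tokens.append(int(logits[0].argmax()))
    return tokens

def multi_token_decode(prompt, steps=8, k=2):
    # Multi-token decoding: each forward pass proposes k tokens,
    # so generating `steps` tokens takes roughly steps/k passes.
    tokens = list(prompt)
    while len(tokens) - len(prompt) < steps:
        logits = toy_logits(tokens, n_heads=k)
        tokens.extend(int(row.argmax()) for row in logits)
    return tokens[:len(prompt) + steps]

print(autoregressive_decode([1, 2, 3]))
print(multi_token_decode([1, 2, 3]))
```

The wall-clock win comes from the reduced number of forward passes; real multi-token schemes also need a rule for deciding which proposed tokens to keep.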
Author:
Tuli, Shikhar, Jha, Niraj K.
Researchers constantly strive to explore larger and more complex search spaces in various scientific studies and physical experiments. However, such investigations often involve sophisticated simulators or time-consuming experiments that make exploring …
External link:
http://arxiv.org/abs/2308.08666
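One common way to search an expensive-to-evaluate space, as described above, is to fit a cheap surrogate model and let it pick the next point to evaluate. A generic sketch with a Gaussian-process surrogate and a lower-confidence-bound rule (an illustrative baseline, not necessarily this paper's method; `expensive_simulator` is a hypothetical stand-in):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_simulator(x):
    # Hypothetical stand-in for a slow simulation or physical experiment.
    return np.sin(3 * x) + 0.1 * x**2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(5, 1))                 # few affordable runs
y = np.array([expensive_simulator(v[0]) for v in X])

candidates = np.linspace(-2, 2, 200).reshape(-1, 1)
for _ in range(10):                                  # surrogate-guided loop
    gp = GaussianProcessRegressor().fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmin(mu - sigma)]       # lower confidence bound
    X = np.vstack([X, [x_next]])
    y = np.append(y, expensive_simulator(x_next[0]))

print("best found:", X[np.argmin(y)][0], y.min())
```

Each loop iteration spends one real simulator call where the surrogate looks most promising, rather than sampling the space blindly.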
Author:
Tuli, Shikhar, Jha, Niraj K.
Automated co-design of machine learning models and evaluation hardware is critical for efficiently deploying such models at scale. Despite the state-of-the-art performance of transformer models, they are not yet ready for execution on resource-constrained …
External link:
http://arxiv.org/abs/2303.14882
Author:
Tuli, Shikhar, Jha, Niraj K.
Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while searching for the best-performing transformer architecture. Furthermore, …
External link:
http://arxiv.org/abs/2303.13745
Author:
Tuli, Shikhar, Jha, Niraj K.
Self-attention-based transformer models have achieved tremendous success in the domain of natural language processing. Despite their efficacy, accelerating the transformer is challenging due to its quadratic computational complexity and large activation …
External link:
http://arxiv.org/abs/2302.14705
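The quadratic cost mentioned in the abstract comes from the n x n attention score matrix. A minimal single-head self-attention in NumPy that makes the O(n²) term explicit (random weights; purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                            # embedding dimension
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def self_attention(X):
    # Single-head self-attention over n token embeddings.
    n, d = X.shape
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)                 # (n, n): quadratic in n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # (n, d)

for n in (128, 256, 512):
    self_attention(rng.standard_normal((n, d)))
    print(f"n={n}: score matrix holds {n * n} entries")
```

Doubling the sequence length quadruples both the score-matrix memory and the matmul work, which is what makes accelerating long-sequence transformers hard.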
Recently, automated co-design of machine learning (ML) models and accelerator architectures has attracted significant attention from both the industry and academia. However, most co-design frameworks either explore a limited search space or employ suboptimal …
External link:
http://arxiv.org/abs/2212.03965
The existence of a plethora of language models makes the problem of selecting the best one for a custom task challenging. Most state-of-the-art methods leverage transformer-based models (e.g., BERT) or their variants. Training such models and exploring …
External link:
http://arxiv.org/abs/2205.11656
In standard generative deep learning models, such as autoencoders or GANs, the size of the parameter set is proportional to the complexity of the generated data distribution. A significant challenge is to deploy resource-hungry deep learning models in …
External link:
http://arxiv.org/abs/2110.02912
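The abstract's premise, that parameter count scales with the complexity of the generated distribution, can be made concrete with a dense-autoencoder parameter count (the layer widths below are arbitrary examples, not from the paper):

```python
def autoencoder_params(encoder_widths):
    # Dense autoencoder: encoder layers plus a mirrored decoder;
    # each layer of shape (a, b) contributes a*b weights + b biases.
    widths = encoder_widths + encoder_widths[-2::-1]
    return sum(a * b + b for a, b in zip(widths, widths[1:]))

print(autoencoder_params([784, 64, 16]))    # small model, simple data
print(autoencoder_params([784, 512, 128]))  # larger model, richer data
```

Richer target distributions push the widths, and hence the parameter count, up, which is the deployment challenge the abstract points to.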
Modern machine learning models for computer vision exceed humans in accuracy on specific visual recognition tasks, notably on datasets like ImageNet. However, high accuracy can be achieved in many ways. The particular decision function found by a machine …
External link:
http://arxiv.org/abs/2105.07197