Search results

Report

FineZip : Pushing the Limits of Large Language Models for Practical Lossless Text Compression

Authors: Mittu, Fazal; Bu, Yihuan; Gupta, Akshat; Devireddy, Ashok; Ozdarendeli, Alp Eren; Singh, Anant; Anumanchipalli, Gopala

While the language modeling objective has been shown to be deeply connected with compression, it is surprising that modern LLMs are not employed in practical text compression systems. In this paper, we provide an in-depth analysis of neural network a

External link: http://arxiv.org/abs/2409.17141

Report

Assessing FIFO and Round Robin Scheduling: Effects on Data Pipeline Performance and Energy Usage

Authors: Choudhury, Malobika Roy; Mehrotra, Akshat

In the case of compute-intensive machine learning, efficient operating system scheduling is crucial for performance and energy efficiency. This paper conducts a comparative study of FIFO (First-In-First-Out) and RR (Round-Robin) scheduling policies w

External link: http://arxiv.org/abs/2409.15704

Report

Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm

Authors: Gupta, Akshat; Ozdemir, Atahan; Anumanchipalli, Gopala

Layer normalization is a pivotal step in the transformer architecture. This paper delves into the less explored geometric implications of this process, examining how LayerNorm influences the norm and orientation of hidden vectors in the representatio

External link: http://arxiv.org/abs/2409.12951

Report

Brunn-Minkowski type estimates for certain discrete sumsets

Authors: Bruch, Albert Lopez; Jing, Yifan; Mudgal, Akshat

Let $d,k$ be natural numbers and let $\mathcal{L}_1, \dots, \mathcal{L}_k \in \mathrm{GL}_d(\mathbb{Q})$ be linear transformations such that there are no non-trivial subspaces $U, V \subseteq \mathbb{Q}^d$ of the same dimension satisfying $\mathcal{L

External link: http://arxiv.org/abs/2409.05638

Report

Explaining 95 (or so) GeV Anomalies in the 2-Higgs Doublet Model Type-I

Authors: Khanna, Akshat; Moretti, Stefano; Sarkar, Agnivo

We show how the 2-Higgs Doublet Model (2HDM) Type-I can explain some excesses recently seen at the Large Hadron Collider (LHC) in $\gamma\gamma$ and $\tau^+\tau^-$ final states in turn matching Large Electron Positron (LEP) data in $b\bar b$ signatur

External link: http://arxiv.org/abs/2409.02587

Report

CoDi: Conversational Distillation for Grounded Question Answering

Authors: Huber, Patrick; Einolghozati, Arash; Conway, Rylan; Narang, Kanika; Smith, Matt; Nayyar, Waqar; Sagar, Adithya; Aly, Ahmed; Shrivastava, Akshat

Distilling conversational skills into Small Language Models (SLMs) with approximately 1 billion parameters presents significant challenges. Firstly, SLMs have limited capacity in their model parameters to learn extensive knowledge compared to larger

External link: http://arxiv.org/abs/2408.11219

Report

A Factored MDP Approach To Moving Target Defense With Dynamic Threat Modeling and Cost Efficiency

Authors: Bose, Megha; Paruchuri, Praveen; Kumar, Akshat

Moving Target Defense (MTD) has emerged as a proactive and dynamic framework to counteract evolving cyber threats. Traditional MTD approaches often rely on assumptions about the attacker's knowledge and behavior. However, real-world scenarios are inhe

External link: http://arxiv.org/abs/2408.08934

Report

Power Aware Container Placement in Cloud Computing with Affinity and Cubic Power Model

Authors: Sarkar, Suvarthi; Sharma, Nandini; Mittal, Akshat; Sahu, Aryabartta

Modern data centres are increasingly adopting containers to enhance power and performance efficiency. These data centres consist of multiple heterogeneous machines, each equipped with varying amounts of resources such as CPU, I/O, memory, and network

External link: http://arxiv.org/abs/2408.01176

Report

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Authors: Lin, Xi Victoria; Shrivastava, Akshat; Luo, Liang; Iyer, Srinivasan; Lewis, Mike; Ghosh, Gargi; Zettlemoyer, Luke; Aghajanyan, Armen

We introduce MoMa, a novel modality-aware mixture-of-experts (MoE) architecture designed for pre-training mixed-modal, early-fusion language models. MoMa processes images and text in arbitrary sequences by dividing expert modules into modality-specif

External link: http://arxiv.org/abs/2407.21770
