Search results

Report

SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF

Authors: Chegini, Atoosa; Kazemi, Hamid; Mirzadeh, Iman; Yin, Dong; Horton, Maxwell; Nabi, Moin; Farajtabar, Mehrdad; Alizadeh, Keivan

In Large Language Model (LLM) development, Reinforcement Learning from Human Feedback (RLHF) is crucial for aligning models with human values and preferences. RLHF traditionally relies on the Kullback-Leibler (KL) divergence between the current polic

External link: http://arxiv.org/abs/2411.01798

Report

Computational Bottlenecks of Training Small-scale Large Language Models

Authors: Ashkboos, Saleh; Mirzadeh, Iman; Alizadeh, Keivan; Sekhavat, Mohammad Hossein; Nabi, Moin; Farajtabar, Mehrdad; Faghri, Fartash

While large language models (LLMs) dominate the AI landscape, Small-scale large Language Models (SLMs) are gaining attention due to cost and efficiency demands from consumers. However, there is limited research on the training behavior and computatio

External link: http://arxiv.org/abs/2410.19456

Report

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

Authors: Mirzadeh, Iman; Alizadeh, Keivan; Shahrokhi, Hooman; Tuzel, Oncel; Bengio, Samy; Farajtabar, Mehrdad

Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level que

External link: http://arxiv.org/abs/2410.05229

Report

Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models

Authors: Alizadeh, Keivan; Mirzadeh, Iman; Shahrokhi, Hooman; Belenko, Dmitry; Sun, Frank; Cho, Minsik; Sekhavat, Mohammad Hossein; Nabi, Moin; Farajtabar, Mehrdad

Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget, leading to inefficient resource utilization. To address this shortcoming, recent advancements in mixture of expert (MoE) models, speculative decoding

External link: http://arxiv.org/abs/2410.10846

Report

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization

Authors: Samragh, Mohammad; Mirzadeh, Iman; Vahid, Keivan Alizadeh; Faghri, Fartash; Cho, Minsik; Nabi, Moin; Naik, Devang; Farajtabar, Mehrdad

The pre-training phase of language models often begins with randomly initialized parameters. With the current trends in scaling models, training their large number of parameters can be extremely slow and costly. In contrast, small language models are

External link: http://arxiv.org/abs/2409.12903

Report

OpenELM: An Efficient Language Model Family with Open Training and Inference Framework

Authors: Mehta, Sachin; Sekhavat, Mohammad Hossein; Cao, Qingqing; Horton, Maxwell; Jin, Yanzi; Sun, Chenfan; Mirzadeh, Iman; Najibi, Mahyar; Belenko, Dmitry; Zatloukal, Peter; Rastegari, Mohammad

The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks. To this end, we releas

External link: http://arxiv.org/abs/2404.14619

Report

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Authors: Alizadeh, Keivan; Mirzadeh, Iman; Belenko, Dmitry; Khatamifard, Karen; Cho, Minsik; Del Mundo, Carlo C; Rastegari, Mohammad; Farajtabar, Mehrdad

Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their substantial computational and memory requirements present challenges, especially for devices with limi

External link: http://arxiv.org/abs/2312.11514

Report

ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

Authors: Mirzadeh, Iman; Alizadeh, Keivan; Mehta, Sachin; Del Mundo, Carlo C; Tuzel, Oncel; Samei, Golnoosh; Rastegari, Mohammad; Farajtabar, Mehrdad

Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications. However, their demanding computation during inference has raised significant challenges for deployment on resource-constrained devices. Despite rec

External link: http://arxiv.org/abs/2310.04564

Report

Dimension bounds for escape on average in homogeneous spaces

Authors: Kleinbock, Dmitry; Mirzadeh, Shahriar

Let $X = G/\Gamma$, where $G$ is a Lie group and $\Gamma$ is a uniform lattice in $G$, and let $O$ be an open subset of $X$. We give an upper estimate for the Hausdorff dimension of the set of points whose trajectories escape $O$ on average with freq

External link: http://arxiv.org/abs/2310.00122

Academic article

Analyzing the Role of Blockchain in Realizing Islamic Banking with a Perspective on the Constitution of the Islamic Republic of Iran

Authors: Kheirollah Parvin, Vali Rostami, Nader Mirzadeh Koohshahi, Ali Allahyarifard

Published in: پژوهش‌نامه حقوق اسلامی, Vol 25, Iss 3, Pp 597-628 (2024)

Introduction: The concept of Islamic banking is grounded in principles that emphasize rights, justice, and public welfare, aiming to manifest human ethics in financial transactions. Unlike conventional banking, Islamic banking prohibits

External link: https://doaj.org/article/832ba9e29ec74c3891a55a2a740f572a
