Showing 1 - 10 of 82 for search: '"Nabi, Moin"'
Author:
Ashkboos, Saleh; Mirzadeh, Iman; Alizadeh, Keivan; Sekhavat, Mohammad Hossein; Nabi, Moin; Farajtabar, Mehrdad; Faghri, Fartash
While large language models (LLMs) dominate the AI landscape, small-scale large language models (SLMs) are gaining attention due to cost and efficiency demands from consumers. However, there is limited research on the training behavior and computational …
External link:
http://arxiv.org/abs/2410.19456
Author:
Horton, Maxwell; Cao, Qingqing; Sun, Chenfan; Jin, Yanzi; Mehta, Sachin; Rastegari, Mohammad; Nabi, Moin
Inference with transformer-based language models begins with a prompt processing step. In this step, the model generates the first output token and stores the KV cache needed for future generation steps. This prompt processing step can be computationally … (an illustrative prefill/KV-cache sketch follows this entry)
External link:
http://arxiv.org/abs/2410.08391
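To make the prefill step described in the entry above concrete, here is a minimal sketch of single-head attention with a KV cache: the prompt is processed token by token while keys and values are stored, and a later decoding step reuses the cache instead of re-reading the prompt. The shapes, weights, and toy single-head design are illustrative assumptions, not the paper's method.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 16                                        # toy head dimension (assumption)
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

def attend(x, kv_cache):
    """Single-head attention over the running KV cache (illustrative only)."""
    q = x @ Wq
    kv_cache["k"].append(x @ Wk)
    kv_cache["v"].append(x @ Wv)
    K = np.stack(kv_cache["k"])               # (positions cached so far, d)
    V = np.stack(kv_cache["v"])
    scores = softmax(q @ K.T / np.sqrt(d))
    return scores @ V

# Prefill: process the whole prompt, populating the cache position by position.
prompt = rng.standard_normal((5, d))          # 5 prompt-token embeddings
cache = {"k": [], "v": []}
for tok in prompt:
    out = attend(tok, cache)                  # the last `out` would drive the first generated token

# Decode: a new token reuses the cached keys/values instead of reprocessing the prompt.
new_tok = rng.standard_normal(d)
out = attend(new_tok, cache)
print(len(cache["k"]), out.shape)             # 6 cached positions, (16,)
```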
Modern vision models have achieved remarkable success in benchmarks where local features provide critical information about the target. There is now a growing interest in solving tasks that require more global reasoning, where local features offer no …
External link:
http://arxiv.org/abs/2410.08165
Author:
Alizadeh, Keivan; Mirzadeh, Iman; Shahrokhi, Hooman; Belenko, Dmitry; Sun, Frank; Cho, Minsik; Sekhavat, Mohammad Hossein; Nabi, Moin; Farajtabar, Mehrdad
Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget, leading to inefficient resource utilization. To address this shortcoming, recent advancements in mixture-of-experts (MoE) models, speculative decoding … (a minimal token-by-token decoding sketch follows this entry)
External link:
http://arxiv.org/abs/2410.10846
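As a reference point for the fixed-compute generation the entry above describes, here is a minimal sketch of plain greedy, token-by-token decoding in which every generated token costs one full forward pass. The toy model, vocabulary size, and dimensions are assumptions for illustration; this is the baseline behavior being criticized, not the paper's adaptive-compute method.

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB, DIM = 50, 8                        # toy sizes (assumptions)
embed = rng.standard_normal((VOCAB, DIM))
proj = rng.standard_normal((DIM, VOCAB))

def toy_lm_logits(token_ids):
    """Stand-in for a model forward pass over the full context."""
    h = embed[token_ids].mean(axis=0)     # crude context summary
    return h @ proj

def greedy_decode(prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):       # one full forward pass per generated token,
        logits = toy_lm_logits(np.array(ids))  # regardless of how easy the token is
        ids.append(int(logits.argmax()))
    return ids

print(greedy_decode([3, 17, 42]))
```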
Author:
Samragh, Mohammad; Mirzadeh, Iman; Vahid, Keivan Alizadeh; Faghri, Fartash; Cho, Minsik; Nabi, Moin; Naik, Devang; Farajtabar, Mehrdad
The pre-training phase of language models often begins with randomly initialized parameters. With the current trend toward scaling up models, training their large number of parameters can be extremely slow and costly. In contrast, small language models are …
External link:
http://arxiv.org/abs/2409.12903
Author:
Klein, Tassilo; Nabi, Moin
The generation of undesirable and factually incorrect content by large language models poses a significant challenge and remains a largely unsolved issue. This paper studies the integration of a contrastive learning objective for fine-tuning LLMs for … (a generic contrastive-loss sketch follows this entry)
External link:
http://arxiv.org/abs/2401.08491
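For readers unfamiliar with contrastive objectives, here is a generic InfoNCE-style loss over paired sequence embeddings, the usual shape of such a term. The function name, temperature value, and use of in-batch negatives are assumptions for illustration; the paper's exact objective may differ.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE contrastive loss (illustrative; not the paper's exact objective).

    anchors, positives: (batch, dim) embeddings where row i of each forms a positive pair;
    all other rows in the batch act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (batch, batch) scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # pull matched pairs together, push others apart

rng = np.random.default_rng(2)
z1, z2 = rng.standard_normal((4, 32)), rng.standard_normal((4, 32))
print(info_nce(z1, z2))
```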
Author:
Fini, Enrico; Astolfi, Pietro; Alahari, Karteek; Alameda-Pineda, Xavier; Mairal, Julien; Nabi, Moin; Ricci, Elisa
Published in:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023) 3187-3197
Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations. However, in many real-world scenarios, labels are partially available, motivating a recent line of work on semi-supervised methods …
External link:
http://arxiv.org/abs/2306.07483
Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning …
External link:
http://arxiv.org/abs/2212.05102
Author:
Klein, Tassilo; Nabi, Moin
This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state of the art in few-shot sentence embedding. The proposed approach imposes alignment between the attention pattern of different views … (an illustrative attention-alignment sketch follows this entry)
External link:
http://arxiv.org/abs/2211.04928
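To illustrate the general idea of aligning attention patterns across views of the same sentence, here is a toy regularizer that penalizes disagreement between the attention maps of two dropout views. The symmetric-KL penalty, tensor shapes, and function name are assumptions chosen for brevity; miCSE's actual term is mutual-information based and its details differ.

```python
import numpy as np

def attention_alignment_penalty(attn_view1, attn_view2, eps=1e-9):
    """Toy regularizer encouraging two views' attention maps to agree.

    attn_view*: (heads, seq, seq) row-stochastic attention from two dropout views
    of the same sentence. A symmetric KL divergence is used purely for illustration.
    """
    p, q = attn_view1 + eps, attn_view2 + eps
    kl_pq = (p * np.log(p / q)).sum(axis=-1)
    kl_qp = (q * np.log(q / p)).sum(axis=-1)
    return 0.5 * (kl_pq + kl_qp).mean()

rng = np.random.default_rng(3)
raw1, raw2 = rng.random((4, 6, 6)), rng.random((4, 6, 6))
a1 = raw1 / raw1.sum(axis=-1, keepdims=True)   # normalize rows into attention weights
a2 = raw2 / raw2.sum(axis=-1, keepdims=True)
print(attention_alignment_penalty(a1, a2))
```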
Machine learning systems are often deployed in domains that entail data from multiple modalities; for example, phenotypic and genotypic characteristics describe patients in healthcare. Previous works have developed multimodal variational autoencoders … (a product-of-experts fusion sketch follows this entry)
External link:
http://arxiv.org/abs/2204.05229
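As background on how multimodal variational autoencoders commonly fuse modality-specific posteriors, here is a product-of-experts sketch that combines per-modality Gaussian posteriors with a standard-normal prior into a joint posterior. Product-of-experts fusion is one common choice in this literature, shown here as an assumption; it is not necessarily the approach taken in this paper.

```python
import numpy as np

def product_of_experts(mus, logvars):
    """Fuse per-modality Gaussian posteriors q(z | x_m) into a joint Gaussian posterior.

    mus, logvars: lists of (latent_dim,) arrays, one pair per observed modality.
    """
    precisions = [np.exp(-lv) for lv in logvars]   # each expert's precision 1 / sigma^2
    joint_prec = 1.0 + sum(precisions)             # the N(0, I) prior contributes precision 1
    joint_var = 1.0 / joint_prec
    joint_mu = joint_var * sum(m * p for m, p in zip(mus, precisions))
    return joint_mu, np.log(joint_var)

rng = np.random.default_rng(4)
# Hypothetical per-modality encoder outputs, echoing the phenotype/genotype example above.
mu_pheno, lv_pheno = rng.standard_normal(8), rng.standard_normal(8)
mu_geno, lv_geno = rng.standard_normal(8), rng.standard_normal(8)
mu, logvar = product_of_experts([mu_pheno, mu_geno], [lv_pheno, lv_geno])
print(mu.shape, logvar.shape)
```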