Showing 1 - 10 of 22,487 for search: '"A. Moin"'
Author:
Chegini, Atoosa, Kazemi, Hamid, Mirzadeh, Iman, Yin, Dong, Horton, Maxwell, Nabi, Moin, Farajtabar, Mehrdad, Alizadeh, Keivan
In Large Language Model (LLM) development, Reinforcement Learning from Human Feedback (RLHF) is crucial for aligning models with human values and preferences. RLHF traditionally relies on the Kullback-Leibler (KL) divergence between the current policy …
External link:
http://arxiv.org/abs/2411.01798
Author:
Ashkboos, Saleh, Mirzadeh, Iman, Alizadeh, Keivan, Sekhavat, Mohammad Hossein, Nabi, Moin, Farajtabar, Mehrdad, Faghri, Fartash
While large language models (LLMs) dominate the AI landscape, small-scale large language models (SLMs) are gaining attention due to cost and efficiency demands from consumers. However, there is limited research on the training behavior and computational …
External link:
http://arxiv.org/abs/2410.19456
Author:
Shabram, Megan, McClelland, Ryan, Wu, John, Venkataram, Hamsa Shwetha, Segars, Heidi, Dean, Bruce, Ye, Christine, Moin, Aquib, Ansdell, Megan, Moussa, Mark, Rebbapragada, Umaa, Valizadegan, Hamed, Perini, Dominick, Ko, Glenn, Da Poian, Victoria, Gharib-Nezhad, Sam, Cataldo, Giuseppe
Here we present several use cases for using Generative AI (Gen AI) to improve systems engineering and cognitive knowledge management related to the future of astronomy, drawn from a culmination of working meetings and presentations as part of the Gen AI Task …
External link:
http://arxiv.org/abs/2410.16609
Author:
Fang, Ching, Sandino, Christopher, Mahasseni, Behrooz, Minxha, Juri, Pouransari, Hadi, Azemi, Erdrin, Moin, Ali, Zippi, Ellen
Many healthcare applications are inherently multimodal, involving several physiological signals. As sensors for these signals become more common, improving machine learning methods for multimodal healthcare data is crucial. Pretraining foundation models …
External link:
http://arxiv.org/abs/2410.16424
Author:
Agrawal-Chung, Navin, Moin, Zohran
Landmine detection using traditional methods is slow, dangerous, and prohibitively expensive. Using deep learning-based object detection algorithms on drone videos is promising but faces multiple challenges due to the small, soda-can size of recently prevalent …
External link:
http://arxiv.org/abs/2410.19807
Author:
Liu, Ran, Ma, Wenrui, Zippi, Ellen, Pouransari, Hadi, Xiao, Jingyun, Sandino, Chris, Mahasseni, Behrooz, Minxha, Juri, Azemi, Erdrin, Dyer, Eva L., Moin, Ali
Time series data are inherently functions of time, yet current transformers often learn time series by modeling them as mere concatenations of time periods, overlooking their functional properties. In this work, we propose a novel objective for transformers …
External link:
http://arxiv.org/abs/2410.08421
Author:
Horton, Maxwell, Cao, Qingqing, Sun, Chenfan, Jin, Yanzi, Mehta, Sachin, Rastegari, Mohammad, Nabi, Moin
Inference with transformer-based language models begins with a prompt processing step. In this step, the model generates the first output token and stores the KV cache needed for future generation steps. This prompt processing step can be computationally …
External link:
http://arxiv.org/abs/2410.08391
Modern vision models have achieved remarkable success in benchmarks where local features provide critical information about the target. There is now a growing interest in solving tasks that require more global reasoning, where local features offer no …
External link:
http://arxiv.org/abs/2410.08165
Author:
Patel, Gaurav, Sandino, Christopher, Mahasseni, Behrooz, Zippi, Ellen L, Azemi, Erdrin, Moin, Ali, Minxha, Juri
In this paper, we propose a framework for efficient Source-Free Domain Adaptation (SFDA) in the context of time series, focusing on enhancing both parameter efficiency and data-sample utilization. Our approach introduces an improved paradigm for source-free …
External link:
http://arxiv.org/abs/2410.02147
Author:
Alizadeh, Keivan, Mirzadeh, Iman, Shahrokhi, Hooman, Belenko, Dmitry, Sun, Frank, Cho, Minsik, Sekhavat, Mohammad Hossein, Nabi, Moin, Farajtabar, Mehrdad
Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget, leading to inefficient resource utilization. To address this shortcoming, recent advancements in mixture-of-experts (MoE) models, speculative decoding, …
External link:
http://arxiv.org/abs/2410.10846