Showing 1 - 10 of 399 for search: '"Rostamizadeh P"'
Author:
Rawat, Ankit Singh, Sadhanala, Veeranjaneyulu, Rostamizadeh, Afshin, Chakrabarti, Ayan, Jitkrittum, Wittawat, Feinberg, Vladimir, Kim, Seungyeon, Harutyunyan, Hrayr, Saunshi, Nikunj, Nado, Zachary, Shivanna, Rakesh, Reddi, Sashank J., Menon, Aditya Krishna, Anil, Rohan, Kumar, Sanjiv
A primary challenge in large language model (LLM) development is their onerous pre-training cost. Typically, such pre-training involves optimizing a self-supervised objective (such as next-token prediction) over a large corpus. This paper explores…
External link:
http://arxiv.org/abs/2410.18779
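Where the abstract mentions "optimizing a self-supervised objective (such as next-token prediction)", the following toy sketch shows what that objective computes; all names, shapes, and the random "model" scores are illustrative, not code from the paper.

import numpy as np

def next_token_loss(logits: np.ndarray, tokens: np.ndarray) -> float:
    """Average cross-entropy of predicting tokens[t+1] from position t.

    logits: (T, V) array of unnormalized scores, one row per position.
    tokens: (T,) array of token ids.
    """
    # Shift by one: the logits at position t are scored against token t+1.
    logits, targets = logits[:-1], tokens[1:]
    # Log-softmax, computed stably by subtracting the row max first.
    logits = logits - logits.max(axis=-1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

# Toy usage: random "model" scores over a 10-token vocabulary.
rng = np.random.default_rng(0)
print(next_token_loss(rng.normal(size=(8, 10)), rng.integers(0, 10, size=8)))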
We present a novel soft prompt based framework, SoftSRV, that leverages a frozen pre-trained large language model (LLM) to generate targeted synthetic text sequences. Given a sample from the target distribution, our proposed framework uses data-driven…
External link:
http://arxiv.org/abs/2410.16534
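As a rough sketch of the soft-prompt idea the abstract refers to (assuming the common setup, not the paper's exact design): a small matrix of trainable continuous embeddings is prepended to the embedded input of a frozen LLM, and only those prompt parameters are updated. The names and dimensions below are hypothetical.

import numpy as np

D_MODEL, PROMPT_LEN = 16, 4

# Trainable soft prompt: a few continuous embedding vectors. Only these
# parameters would be optimized; the LLM's own weights stay frozen.
soft_prompt = np.random.default_rng(0).normal(scale=0.02,
                                              size=(PROMPT_LEN, D_MODEL))

def with_soft_prompt(token_embeddings: np.ndarray) -> np.ndarray:
    """Prepend the soft prompt to a (seq_len, d_model) embedding matrix."""
    return np.concatenate([soft_prompt, token_embeddings], axis=0)

# Usage: a frozen model would consume the extended embedding sequence.
x = np.zeros((10, D_MODEL))       # stand-in for embedded input tokens
print(with_soft_prompt(x).shape)  # (14, 16): prompt_len + seq_len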
Author:
Ye, Ke, Jiang, Heinrich, Rostamizadeh, Afshin, Chakrabarti, Ayan, DeSalvo, Giulia, Kagy, Jean-François, Karydas, Lazaros, Citovsky, Gui, Kumar, Sanjiv
Pre-training large language models is known to be extremely resource intensive and oftentimes inefficient, under-utilizing the information encapsulated in the training text sequences. In this paper, we present SpacTor, a new training procedure…
External link:
http://arxiv.org/abs/2401.13160
Author:
Zhou, Yongchao, Lyu, Kaifeng, Rawat, Ankit Singh, Menon, Aditya Krishna, Rostamizadeh, Afshin, Kumar, Sanjiv, Kagy, Jean-François, Agarwal, Rishabh
Speculative decoding (SD) accelerates large language model inference by employing a faster draft model to generate multiple tokens, which are then verified in parallel by the larger target model, resulting in text generated according to the target…
External link:
http://arxiv.org/abs/2310.08461
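A simplified, greedy sketch of the speculative decoding loop described in the snippet. Real SD verifies draft tokens with a rejection-sampling rule that provably preserves the target model's distribution; this toy version only accepts draft tokens that match the target model's greedy choice. draft_next and target_next are hypothetical stand-ins for the two models.

from typing import Callable, List

def speculative_step(prefix: List[int],
                     draft_next: Callable[[List[int]], int],
                     target_next: Callable[[List[int]], int],
                     k: int = 4) -> List[int]:
    """Extend prefix by up to k draft tokens, verified by the target model."""
    # 1) The cheap draft model proposes k tokens autoregressively.
    proposal = list(prefix)
    for _ in range(k):
        proposal.append(draft_next(proposal))
    # 2) The target model checks each proposed token. In a real system this
    #    verification is one batched forward pass, which is where the speedup
    #    comes from; here it is a plain loop for clarity.
    out = list(prefix)
    for t in proposal[len(prefix):]:
        v = target_next(out)      # the target's own next token here
        if v == t:
            out.append(t)         # draft token accepted
        else:
            out.append(v)         # target corrects the draft; stop this round
            break
    return out

# Toy usage: the "target" counts 0,1,2,... and the draft agrees only for the
# first few positions.
target = lambda seq: len(seq) % 5
draft = lambda seq: len(seq) % 5 if len(seq) < 3 else 99
print(speculative_step([0], draft, target))   # -> [0, 1, 2, 3]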
Author:
Citovsky, Gui, DeSalvo, Giulia, Kumar, Sanjiv, Ramalingam, Srikumar, Rostamizadeh, Afshin, Wang, Yunjuan
We present a subset selection algorithm designed to work with arbitrary model families in a practical batch setting. In such a setting, an algorithm can sample examples one at a time but, in order to limit overhead costs, is only able to update its state…
External link:
http://arxiv.org/abs/2301.12052
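A generic sketch of the batch setting the snippet describes, under the assumption that "state" means whatever expensive per-model bookkeeping the selector maintains: examples are scored one at a time against a possibly stale state, and the state is refreshed only once per selected batch. This illustrates the setting, not the paper's algorithm.

def select_subset(stream, score_fn, update_state, state,
                  window: int = 4, keep: int = 2):
    """Scan the stream in windows of `window` examples; keep the top-`keep`
    scored examples per window, updating the expensive state only once per
    selected batch."""
    selected, buf = [], []
    for example in stream:
        buf.append(example)
        if len(buf) == window:
            # Rank the window using the current (possibly stale) state.
            buf.sort(key=lambda ex: score_fn(state, ex), reverse=True)
            batch = buf[:keep]
            selected.extend(batch)
            state = update_state(state, batch)  # one update per batch
            buf = []
    return selected

# Toy usage: the "state" is a running mean and we prefer examples far from it.
print(select_subset(range(8),
                    score_fn=lambda mu, x: abs(x - mu),
                    update_state=lambda mu, batch: sum(batch) / len(batch),
                    state=0.0))                 # -> [3, 2, 7, 6]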
Given a labeled training set and a collection of unlabeled data, the goal of active learning (AL) is to identify the best unlabeled points to label. In this comprehensive study, we analyze the performance of a variety of AL algorithms on deep neural networks…
External link:
http://arxiv.org/abs/2210.03822
Academic article
This result cannot be displayed to unauthenticated users. Sign in to view it.
Author:
Citovsky, Gui, DeSalvo, Giulia, Gentile, Claudio, Karydas, Lazaros, Rajagopalan, Anand, Rostamizadeh, Afshin, Kumar, Sanjiv
The ability to train complex and highly effective models often requires an abundance of training data, which can easily become a bottleneck in cost, time, and computational resources. Batch active learning, which adaptively issues batched queries…
External link:
http://arxiv.org/abs/2107.14263
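One common heuristic for the batched-query setting (not necessarily this paper's algorithm): pick uncertain points that are spread out, so a single batch does not waste queries on near-duplicates. A toy 1-D version, with hypothetical inputs:

import numpy as np

def batch_query(pool: np.ndarray, uncertainty: np.ndarray,
                batch_size: int) -> np.ndarray:
    """Pick a batch of uncertain points spread across crude value bins."""
    # Crude "clustering": bin the 1-D pool into batch_size buckets.
    edges = np.linspace(pool.min(), pool.max(), batch_size)
    bins = np.digitize(pool, edges)
    picks = []
    for b in np.unique(bins):
        members = np.where(bins == b)[0]
        # Most uncertain example within each bin.
        picks.append(members[np.argmax(uncertainty[members])])
    return np.array(picks[:batch_size])

rng = np.random.default_rng(1)
pool = rng.uniform(size=20)   # unlabeled pool (1-D features for the toy)
unc = rng.uniform(size=20)    # stand-in for per-example model uncertainty
print(batch_query(pool, unc, batch_size=4))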
Published in:
ICLR 2022
In real-world systems, models are frequently updated as more data becomes available, and in addition to achieving high accuracy, the goal is also to maintain a low difference in predictions compared to the base model (i.e., predictive "churn")…
External link:
http://arxiv.org/abs/2106.02654
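The snippet defines predictive "churn" as the difference in predictions between an updated model and the base model; the simplest version of that metric is the fraction of examples whose predicted label changes. Function and variable names below are illustrative.

import numpy as np

def churn(base_preds: np.ndarray, new_preds: np.ndarray) -> float:
    """Fraction of examples whose predicted label changed after the update."""
    return float(np.mean(base_preds != new_preds))

base = np.array([0, 1, 1, 0, 2, 2])
new = np.array([0, 1, 0, 0, 2, 1])  # updated model flips two predictions
print(churn(base, new))             # 0.333...: 2 of 6 predictions changed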
Author:
Jiang, Heinrich, Rostamizadeh, Afshin
We analyze the problem of active covering, where the learner is given an unlabeled dataset and can sequentially issue label queries on examples. The objective is to label all of the positive examples in the fewest number of total label queries. We show…
External link:
http://arxiv.org/abs/2106.02552
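A toy rendering of the active covering setup just described: query points one at a time until every positive is labeled, expanding around positives already found. The greedy nearest-neighbor strategy here is purely illustrative, not the strategy analyzed in the paper.

import numpy as np

def active_cover(points: np.ndarray, labels: np.ndarray, seed: int = 0) -> int:
    """Query labels one at a time until every positive point is labeled;
    return the number of label queries used."""
    rng = np.random.default_rng(seed)
    n = len(points)
    queried = np.zeros(n, dtype=bool)
    order = list(rng.permutation(n))  # fallback order for blind exploration
    found, queries = [], 0
    while not queried[labels == 1].all():
        unqueried = [i for i in range(n) if not queried[i]]
        if found:
            # Expand around known positives: query the closest unlabeled point.
            i = min(unqueried,
                    key=lambda j: min(abs(points[j] - points[p]) for p in found))
        else:
            i = next(j for j in order if not queried[j])
        queried[i] = True
        queries += 1
        if labels[i] == 1:
            found.append(i)
    return queries

rng = np.random.default_rng(2)
points = np.sort(rng.uniform(size=12))
labels = (points > 0.7).astype(int)  # positives cluster at one end
print(active_cover(points, labels), "queries to label", labels.sum(), "positives")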