Showing 1 - 4 of 4 for search: '"Girish, Adway"'
Author:
Girish, Adway, Nagle, Alliot, Bondaschi, Marco, Gastpar, Michael, Makkuva, Ashok Vardhan, Kim, Hyeji
We formalize the problem of prompt compression for large language models (LLMs) and present a framework to unify token-level prompt compression methods which create hard prompts for black-box models. We derive the distortion-rate function for this se
External link:
http://arxiv.org/abs/2407.15504
Author:
Makkuva, Ashok Vardhan, Bondaschi, Marco, Ekbote, Chanakya, Girish, Adway, Nagle, Alliot, Kim, Hyeji, Gastpar, Michael
In recent years, transformer-based models have revolutionized deep learning, particularly in sequence modeling. To better understand this phenomenon, there is a growing interest in using Markov input processes to study transformers. However, our curr
External link:
http://arxiv.org/abs/2406.03072
Author:
Makkuva, Ashok Vardhan, Bondaschi, Marco, Girish, Adway, Nagle, Alliot, Jaggi, Martin, Kim, Hyeji, Gastpar, Michael
In recent years, attention-based transformers have achieved tremendous success across a variety of disciplines including natural languages. A key ingredient behind their success is the generative pretraining procedure, during which these models are t
External link:
http://arxiv.org/abs/2402.04161
We study the problem of best-arm identification in a distributed variant of the multi-armed bandit setting, with a central learner and multiple agents. Each agent is associated with an arm of the bandit, generating stochastic rewards following an unk
External link:
http://arxiv.org/abs/2305.00528