Showing 1 - 3 of 3 for search: '"Rodkin, Ivan"'
This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step. Our approach, Associative Recurrent Memory Transformer (ARMT), is based on tran
External link:
http://arxiv.org/abs/2407.04841
Author:
Kuratov, Yuri, Bulatov, Aydar, Anokhin, Petr, Rodkin, Ivan, Sorokin, Dmitry, Sorokin, Artyom, Burtsev, Mikhail
In recent years, the input context sizes of large language models (LLMs) have increased dramatically. However, existing evaluation methods have not kept pace, failing to comprehensively assess the efficiency of models in handling long contexts. To br
External link:
http://arxiv.org/abs/2406.10149
Published in:
In Procedia Computer Science 2022 213:570-579