Showing 1 - 2 of 2 for search: '"Zuo, Chunsheng"'
Transformers with causal attention can solve tasks that require positional information without using positional encodings. In this work, we propose and investigate a new hypothesis about how positional information can be stored without using explicit positional encodings…
External link: http://arxiv.org/abs/2501.00073
Author: Zuo, Chunsheng; Guerzhoy, Michael
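The capability this abstract claims, that causal attention alone can carry positional information, can be illustrated numerically. Below is a minimal NumPy sketch, not the authors' code: the weight matrices are hypothetical random stand-ins for trained parameters. It shows that in a two-layer causal self-attention stack with no positional encodings, the output at the last position is generically not invariant to reordering the earlier tokens, so the architecture can in principle encode their order.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 5  # embedding dimension, sequence length

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def causal_layer(X, Wq, Wk, Wv):
    # One self-attention layer: causal mask, no positional encodings.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    scores[np.triu(np.ones((n, n), dtype=bool), k=1)] = -np.inf  # token i sees only tokens 1..i
    return softmax(scores) @ V

# Hypothetical random weights standing in for a trained model.
layers = [tuple(rng.standard_normal((d, d)) for _ in range(3)) for _ in range(2)]

def forward(X):
    for W in layers:
        X = causal_layer(X, *W)
    return X

X = rng.standard_normal((n, d))   # embeddings of tokens 1..n
perm = np.array([2, 0, 3, 1, 4])  # reorder tokens 1..n-1, keep token n in place

# Layer 1 gives each earlier token a summary of its own prefix, which changes
# under reordering; layer 2 reads those summaries, so the last output differs.
print(np.allclose(forward(X)[-1], forward(X[perm])[-1]))  # False for generic weights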
As we show in this paper, the prediction for output token $n+1$ of Transformer architectures that lack one of the two mechanisms, positional encodings or causal attention, is invariant to permutations of input tokens $1, 2, \ldots, n-1$. Usually, both mechanisms…
External link: http://arxiv.org/abs/2402.05969
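The invariance claim is exact and easy to verify for the case where the causal mask is the missing mechanism. In the following minimal NumPy sketch (the single attention layer and random weights are illustrative assumptions, not the paper's setup), permuting tokens $1, \ldots, n-1$ permutes the key and value rows identically while leaving the last query unchanged, so the output at position $n$, and hence the prediction for token $n+1$, is exactly the same.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 8, 5
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))  # hypothetical weights

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention(X):
    # Self-attention with no positional encodings and no causal mask.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(Q @ K.T / np.sqrt(d)) @ V

X = rng.standard_normal((n, d))
perm = np.array([3, 1, 0, 2, 4])  # permute tokens 1..n-1, keep token n last

# The query at the last position is unchanged, and the (key, value) pairs form
# the same multiset, so the softmax-weighted sum is exactly the same vector.
print(np.allclose(attention(X)[-1], attention(X[perm])[-1]))  # True
```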