Autor: |
Ostmeier, Sophie, Axelrod, Brian, Moseley, Michael E., Chaudhari, Akshay, Langlotz, Curtis |
Rok vydání: |
2024 |
Předmět: |
|
Druh dokumentu: |
Working Paper |
Popis: |
While Rotary Position Embeddings (RoPE) for large language models have become widely adopted, their application for other modalities has been slower. Here, we introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs. We evaluate the performance of LieRE on 2D and 3D image classification tasks and observe that LieRE leads to marked relative improvements in performance (up to 9.7% for 2D and up to 25.5% for 3D), training efficiency (3.5x reduction), data efficiency (30%) compared to the baselines of DeiT III, RoPE-Mixed and Vision-Llama. https://github.com/Stanford-AIMI/LieRE |
Databáze: |
arXiv |
Externí odkaz: |
|