LNLF-BERT: Transformer for Long Document Classification With Multiple Attention Levels

Author: Linh Manh Pham, Hoang Cao The
Language: English
Year of publication: 2024
Subject:
Source: IEEE Access, Vol 12, pp. 165348-165358 (2024)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3492102
Description: Transformer-based models, such as Bidirectional Encoder Representations from Transformers (BERT), cannot process long sequences because their self-attention operation scales quadratically with the sequence length. To remedy this, we introduce the Look Near and Look Far BERT (LNLF-BERT) with a two-level self-attention mechanism at the sentence and document levels, which can handle document classification with thousands of tokens. The self-attention mechanism of LNLF-BERT retains some of the benefits of full self-attention at each level while reducing complexity by not applying full self-attention to the whole document. Our theoretical analysis shows that the LNLF-BERT mechanism is an approximator of the full self-attention model. We pretrain LNLF-BERT from scratch and fine-tune it on downstream tasks. Experiments were also conducted to demonstrate the feasibility of LNLF-BERT in long text processing. Moreover, LNLF-BERT effectively balances local and global attention, allowing for efficient document-level understanding. Compared to other long-sequence models such as Longformer and BigBird, LNLF-BERT shows competitive performance in both accuracy and computational efficiency. The architecture is scalable to various downstream tasks, making it adaptable for different applications in natural language processing. (A minimal sketch of the two-level attention described here follows the record.)
Database: Directory of Open Access Journals
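
The sketch below illustrates the kind of two-level ("look near" at the sentence level, "look far" at the document level) self-attention the abstract describes. The exact LNLF-BERT wiring, layer counts, pooling choice, and dimensions are not given in the abstract, so everything here is an illustrative assumption rather than the authors' implementation.

```python
# Hypothetical two-level attention sketch; not the authors' LNLF-BERT code.
import torch
import torch.nn as nn


class TwoLevelAttention(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # "Look near": full self-attention restricted to tokens within one sentence.
        self.sentence_layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        # "Look far": full self-attention over one pooled vector per sentence.
        self.document_layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_sentences, sent_len, d_model) token embeddings.
        b, s, t, d = x.shape
        # Sentence level: attend within each sentence only,
        # costing O(s * t^2) instead of O((s * t)^2) for the full document.
        local = self.sentence_layer(x.reshape(b * s, t, d)).reshape(b, s, t, d)
        # Pool each sentence (mean pooling assumed) into one sentence vector.
        sent_repr = local.mean(dim=2)                # (b, s, d)
        # Document level: sentence vectors attend to each other.
        doc = self.document_layer(sent_repr)         # (b, s, d)
        # Broadcast document-level context back to every token of its sentence.
        return local + doc.unsqueeze(2)


if __name__ == "__main__":
    x = torch.randn(2, 8, 32, 256)   # 2 documents, 8 sentences, 32 tokens each
    out = TwoLevelAttention()(x)
    print(out.shape)                 # torch.Size([2, 8, 32, 256])
```

The design choice reflected here is the one the abstract motivates: full attention is kept within each level (tokens within a sentence, sentences within a document), so the quadratic cost applies only to short local windows and to the much shorter sequence of sentence representations, not to the thousands of tokens in the whole document.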