Author:
Linh Manh Pham, Hoang Cao The
Language:
English
Year of publication:
2024
Subject:
Source:
IEEE Access, Vol 12, pp. 165348-165358 (2024)
Document type:
article
ISSN:
2169-3536
DOI:
10.1109/ACCESS.2024.3492102
Description:
Transformer-based models, such as Bidirectional Encoder Representations from Transformers (BERT), cannot process long sequences because their self-attention operation scales quadratically with the sequence length. To remedy this, we introduce the Look Near and Look Far BERT (LNLF-BERT), a model with a two-level self-attention mechanism at the sentence and document levels, which can handle document classification over inputs of thousands of tokens. The self-attention mechanism of LNLF-BERT retains some of the benefits of full self-attention at each level while reducing complexity by not applying full self-attention over the whole document. Our theoretical analysis shows that the LNLF-BERT mechanism is an approximator of the full self-attention model. We pretrain LNLF-BERT from scratch and fine-tune it on downstream tasks. Experiments were also conducted to demonstrate the feasibility of LNLF-BERT for long-text processing. Moreover, LNLF-BERT effectively balances local and global attention, allowing for efficient document-level understanding. Compared to other long-sequence models such as Longformer and BigBird, LNLF-BERT shows competitive performance in both accuracy and computational efficiency. The architecture is scalable to various downstream tasks, making it adaptable for different applications in natural language processing.
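The abstract describes the mechanism only at a high level. As a rough illustration of the two-level idea (full self-attention inside short segments, i.e. "look near", followed by attention over one summary vector per segment, i.e. "look far"), the PyTorch module below is a minimal sketch; the class name, segment length, mean-pooling choice, and dimensions are assumptions for illustration, not the authors' LNLF-BERT implementation.

```python
# Hypothetical sketch of a two-level ("look near" / "look far") attention block.
# All names, dimensions, and the mean-pooling summary are illustrative assumptions.
import torch
import torch.nn as nn


class TwoLevelSelfAttention(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4, seg_len: int = 32):
        super().__init__()
        self.seg_len = seg_len
        # "Look near": full self-attention restricted to tokens within one segment.
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # "Look far": self-attention over one summary vector per segment.
        self.global_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len is assumed to be a multiple of seg_len.
        b, n, d = x.shape
        s = self.seg_len
        segs = x.reshape(b * n // s, s, d)                  # split into segments
        local, _ = self.local_attn(segs, segs, segs)        # intra-segment attention
        summaries = local.mean(dim=1).reshape(b, n // s, d) # one vector per segment
        glob, _ = self.global_attn(summaries, summaries, summaries)
        # Broadcast each segment's global context back onto its tokens.
        glob_tok = glob.repeat_interleave(s, dim=1)
        return local.reshape(b, n, d) + glob_tok


if __name__ == "__main__":
    model = TwoLevelSelfAttention()
    doc = torch.randn(2, 1024, 256)  # two toy documents of 1024 tokens each
    out = model(doc)
    print(out.shape)                 # torch.Size([2, 1024, 256])
```

Under these assumptions, the intra-segment pass costs roughly O(n x s) and the segment-summary pass roughly O((n/s)^2), rather than the O(n^2) of full self-attention over the whole document, which is the kind of saving the abstract refers to.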
Database:
Directory of Open Access Journals
External link: