Popis: |
Source code summaries play a crucial role in helping programmers comprehend the behavior of source code functions. Recent deep-learning-based approaches to Source Code Summarization have increasingly focused on Transformer-based models. These models use self-attention mechanisms to overcome the long-range dependency issue that earlier models often encounter, making them a promising solution for the Source Code Summarization task. However, these models suffer from two shortcomings: 1) they are weak at handling the semantics of keywords, and 2) they struggle to learn source code with complex structure. To resolve these shortcomings, our study proposes integrating Non-Fourier and AST-Structural Relative Position Representations into a Transformer-based model for Source Code Summarization, which we name NFASRPR-TRANS. NFASRPR-TRANS employs two types of positional encoding schemes in two different Transformer encoders. The first encoder handles the semantics of the keywords in the input source code sequence by using a Gaussian Embedder to encode the non-Fourier relative position representation of the sequence. The second encoder uses Tree Positional Encoding to learn the structural information of Abstract Syntax Trees (ASTs), which provides relative position information within the ASTs for generating the source code summaries. Finally, we compared NFASRPR-TRANS with previous models and evaluated its performance on the Java and Python datasets using five metrics: BLEU, ROUGE-L, CIDEr, METEOR, and SPICE. NFASRPR-TRANS achieves 2%-10% improvements across all five metrics on both datasets. |