Author: |
Tae-Seok Lee, Hyun-Young Lee, Seung-Shik Kang |
Source: |
Journal of Information Processing Systems; Jun2022, Vol. 18 Issue 3, p344-358, 15p |
Abstract: |
Text summarization is the task of producing a shorter version of a long document while accurately preserving the main contents of the original text. Abstractive summarization generates novel words and phrases using a language generation method through text transformation and prior-embedded word information. However, newly coined words and out-of-vocabulary words decrease the performance of automatic summarization because they are not pre-trained in the machine learning process. In this study, we demonstrated an improvement in summarization quality through the contextualized embedding of BERT with out-of-vocabulary masking. In addition, by explicitly providing precise pointing and an optional copy instruction along with the BERT embedding, we achieved higher accuracy than the baseline model. The recall-based word-generation metric ROUGE-1 score was 55.11 and the word-order-based ROUGE-L score was 39.65. [ABSTRACT FROM AUTHOR] |
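The ROUGE-1 score reported above counts overlapping unigrams between a reference summary and a system summary. A minimal sketch of recall-based ROUGE-1 (simplified: plain whitespace tokenization, no stemming or multi-reference handling, which the official metric supports):

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams also produced by the candidate."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Clipped overlap: each reference token counts at most as often as it occurs there.
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

# 5 of the 6 reference unigrams are matched ("sat" is not).
print(rouge1_recall("the cat sat on the mat", "the cat lay on the mat"))  # → 0.8333...
```

ROUGE-L, the second metric cited, instead scores the longest common subsequence of the two summaries, rewarding in-order word agreement rather than bag-of-words overlap.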
Database: |
Complementary Index |
External link: |
|