Extractive summarization using siamese hierarchical transformer encoders

Authors: Lluís F. Hurtado, Emilio Sanchis, Fernando García-Granada, Encarna Segarra, José-Ángel González
Language: English
Year of publication: 2020
Source: RiuNet. Repositorio Institucional de la Universitat Politècnica de València
Description: In this paper, we present an extractive approach to document summarization, the Siamese Hierarchical Transformer Encoders system, based on siamese neural networks and transformer encoders extended in a hierarchical way. The system, trained for binary classification, assigns an attention score to each sentence in the document; these scores are used to select the most relevant sentences to build the summary. The main novelty of our proposal is the use of self-attention mechanisms at the sentence level for document summarization, instead of attention at the word level only. Experiments on the CNN/DailyMail summarization corpus show promising results, in line with the state of the art.
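As a rough illustration of the hierarchical idea described above (not the paper's actual trained model), the following NumPy sketch applies word-level self-attention within each sentence, pools the result into sentence vectors, and then runs sentence-level self-attention whose attention weights yield a relevance score per sentence. All weights and embeddings here are random placeholders; the real system learns its parameters via siamese training for binary classification.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over the rows of X.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)
    return A @ V, A

rng = np.random.default_rng(0)
d = 16
# Hypothetical document: 5 sentences x 8 words x d-dim word embeddings.
doc = rng.normal(size=(5, 8, d))
# Random placeholder projection matrices (learned in the real system).
Wq, Wk, Wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))

# Word level: attend within each sentence, mean-pool to a sentence vector.
sent_vecs = np.stack(
    [self_attention(s, Wq, Wk, Wv)[0].mean(axis=0) for s in doc]
)

# Sentence level: self-attention over sentence vectors; averaging the
# attention matrix over its rows gives one relevance score per sentence.
_, A = self_attention(sent_vecs, Wq, Wk, Wv)
scores = A.mean(axis=0)

# Extractive summary: keep the top-2 scoring sentences in document order.
summary_idx = sorted(np.argsort(scores)[-2:])
print(summary_idx)
```

Because each attention row is a probability distribution, the averaged scores also sum to one, so they can be read directly as a normalized relevance ranking over sentences.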
This work has been partially supported by the Spanish MINECO and FEDER funds under project AMIC (TIN2017-85854-C4-2-R). The work of Jose Angel Gonzalez is also financed by Universitat Politecnica de Valencia under grant PAID-01-17.
Database: OpenAIRE