A Condense-then-Select Strategy for Text Summarization
Author: | Irwin King, Hou Pong Chan |
---|---|
Publication year: | 2021 |
Subject: |
FOS: Computer and information sciences
Information Systems and Management; Computer science; Context (language use); 02 engineering and technology; Data_CODINGANDINFORMATIONTHEORY; computer.software_genre; Management Information Systems; Extractor; Artificial Intelligence; 020204 information systems; Compression (functional analysis); 0202 electrical engineering, electronic engineering, information engineering; Computer Science - Computation and Language; business.industry; Automatic summarization; Salient; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Computation and Language (cs.CL); Software; Sentence; Natural language processing |
DOI: | 10.48550/arxiv.2106.10468 |
Description: | Select-then-compress is a popular hybrid framework for text summarization due to its high efficiency. This framework first selects salient sentences and then independently condenses each of the selected sentences into a concise version. However, compressing sentences separately ignores the context information of the document and is therefore prone to deleting salient information. To address this limitation, we propose a novel condense-then-select framework for text summarization. Our framework first concurrently condenses each document sentence. The original document sentences and their compressed versions then become the candidates for extraction. Finally, an extractor uses the context information of the document to select candidates and assemble them into a summary. If salient information is deleted during condensing, the extractor can select the original sentence to retain that information. Thus, our framework helps to avoid the loss of salient information while preserving the high efficiency of sentence-level compression. Experimental results on the CNN/DailyMail, DUC-2002, and PubMed datasets demonstrate that our framework outperforms the select-then-compress framework and other strong baselines. Comment: Accepted by the Knowledge-Based Systems (KBS) journal |
Database: | OpenAIRE |
External link: |
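
The condense-then-select flow described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the compressor and extractor in the paper are learned models, whereas everything below is a hypothetical stand-in. `toy_compress` (keep the text before the first comma), `salient_words` (non-stopwords that occur in at least two sentences, used here as a crude proxy for document context), `rank`, and `summarize` are all assumed names and heuristics chosen only to show the control flow: condense every sentence first, pool originals and compressed versions as candidates, then let a document-aware selector choose among them.

```python
import re
from collections import Counter
from typing import Dict, List, NamedTuple, Set, Tuple

# Hypothetical stopword list, just large enough for the toy example.
STOPWORDS = {"the", "a", "an", "of", "on", "to", "and", "in", "will"}


class Candidate(NamedTuple):
    source: int       # index of the document sentence the candidate came from
    text: str
    compressed: bool  # True if produced by the compressor


def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_compress(sentence: str) -> str:
    # Placeholder for a learned sentence compressor (assumption):
    # keep only the text before the first comma.
    return sentence.split(",", 1)[0].strip()


def build_candidates(sentences: List[str]) -> List[Candidate]:
    # Step 1 and 2: condense every sentence, then pool originals and
    # compressed versions as extraction candidates.
    candidates = []
    for i, sentence in enumerate(sentences):
        candidates.append(Candidate(i, sentence, False))
        short = toy_compress(sentence)
        if short and short != sentence:
            candidates.append(Candidate(i, short, True))
    return candidates


def salient_words(sentences: List[str]) -> Set[str]:
    # Crude document-context proxy (assumption): a word is salient if it
    # appears in at least two sentences and is not a stopword.
    doc_freq = Counter()
    for sentence in sentences:
        doc_freq.update(set(tokens(sentence)) - STOPWORDS)
    return {word for word, count in doc_freq.items() if count >= 2}


def rank(candidate: Candidate, salient: Set[str]) -> Tuple[int, int]:
    # Prefer more salient words; break ties in favour of shorter text.
    toks = tokens(candidate.text)
    return (len(set(toks) & salient), -len(toks))


def summarize(sentences: List[str], k: int = 2) -> List[str]:
    # Step 3: select among candidates using document-level information.
    salient = salient_words(sentences)
    best: Dict[int, Candidate] = {}
    for cand in build_candidates(sentences):
        current = best.get(cand.source)
        if current is None or rank(cand, salient) > rank(current, salient):
            best[cand.source] = cand
    chosen = sorted(best.values(), key=lambda c: rank(c, salient), reverse=True)[:k]
    return [c.text for c in sorted(chosen, key=lambda c: c.source)]


DOC = [
    "The council approved the budget, said a spokesperson on Monday.",
    "After a long debate, the budget funds a new bridge.",
    "The bridge will open next year, weather permitting.",
]

if __name__ == "__main__":
    for line in summarize(DOC, k=2):
        print(line)
```

In this toy document, compressing the second sentence removes both salient words, so the selector falls back to the original sentence, while the first sentence is taken in its compressed form because nothing salient was lost. That fallback is the behaviour the abstract attributes to the framework. The scoring rule itself is only an illustrative assumption.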