Legal public opinion news abstractive summarization by incorporating topic information
Author: | Yantuan Xian, Yu Zhiqiang, Guo Junjun, Huang Yuxin, Zhengtao Yu |
---|---|
Publication year: | 2020 |
Subject: |
Information retrieval; business.industry; Computer science; 05 social sciences; Computational intelligence; 010501 environmental sciences; Public opinion; 01 natural sciences; Automatic summarization; Task (project management); Domain (software engineering); Artificial Intelligence; 0502 economics and business; Pattern recognition (psychology); Selection (linguistics); Domain knowledge; Computer Vision and Pattern Recognition; 050207 economics; business; Software; 0105 earth and related environmental sciences |
Source: | International Journal of Machine Learning and Cybernetics. 11:2039-2050 |
ISSN: | 1868-808X 1868-8071 |
DOI: | 10.1007/s13042-020-01093-8 |
Description: | Automatically generating accurate summaries of legal public opinion news can help readers grasp the main ideas of the news quickly. Although many improved sequence-to-sequence models have been proposed for abstractive text summarization, these approaches face two challenges in domain-specific summarization: (1) selecting appropriate domain knowledge; (2) integrating that domain knowledge effectively into the summarization model. To tackle these challenges, this paper selects pre-trained topic information as the legal domain knowledge and integrates it into the sequence-to-sequence model to improve the performance of public opinion news summarization. Concretely, two kinds of topic information are used. First, the topic words, which denote the main aspects of the source document, are encoded to guide the decoding process. Second, the predicted output is constrained to have a topic probability distribution similar to that of the source document. We evaluate our model on a large dataset of legal public opinion news collected from micro-blogs, and the experimental results show that the proposed model outperforms existing baseline systems on the ROUGE metrics. To the best of our knowledge, this work is the first attempt at text summarization in the legal public opinion domain. |
Database: | OpenAIRE |
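
The description states that the predicted summary is constrained to have a topic probability distribution similar to that of the source document. The record does not say how that similarity is measured, so the following is only a minimal sketch under stated assumptions: it uses KL divergence as the similarity measure and adds it to the generation loss with a scalar weight. The function names (`kl_divergence`, `topic_consistency_loss`, `combined_loss`) and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions over the same topics.

    `eps` guards against log(0) and division by zero; it is an illustrative
    choice, not something specified in the record.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same number of topics")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def topic_consistency_loss(source_topics, summary_topics, weight=1.0):
    """Penalty that grows as the summary's topic mix drifts from the source's.

    Assumption: KL divergence stands in for whatever similarity measure the
    authors actually use.
    """
    return weight * kl_divergence(source_topics, summary_topics)


def combined_loss(generation_loss, source_topics, summary_topics, weight=1.0):
    """Add the topic penalty to an ordinary sequence-to-sequence loss value."""
    return generation_loss + topic_consistency_loss(
        source_topics, summary_topics, weight
    )


if __name__ == "__main__":
    source = [0.7, 0.2, 0.1]      # hypothetical topic mix of a news article
    close = [0.6, 0.3, 0.1]       # summary that stays on topic
    drifted = [0.1, 0.2, 0.7]     # summary that drifts to another topic
    print(topic_consistency_loss(source, close))
    print(topic_consistency_loss(source, drifted))
```

In this sketch a summary whose topic mix matches the source incurs a penalty near zero, while a summary that drifts toward other topics incurs a larger one, which is the behavior the description attributes to the constraint.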