Abstract: |
Given the overwhelming and rapidly increasing volume of published biomedical literature, automatic biomedical text summarization has long been an important task. Recently, great advances in the performance of biomedical text summarization have been achieved by fine-tuning pre-trained language models (PLMs). However, existing PLM-based summarization methods do not capture domain-specific knowledge. This can result in generated summaries with low coherence, including redundant sentences, or excluding important domain knowledge conveyed in the full-text document. Furthermore, the black-box nature of transformers means that they lack explainability, i.e. it is not clear to users how and why a summary was generated. Domain-specific knowledge and explainability are crucial for the accuracy and transparency of biomedical text summarization methods. In this article, we aim to address these issues by proposing a novel domain knowledge-enhanced graph topic transformer (DORIS) for explainable biomedical text summarization. The model integrates a graph neural topic model and domain-specific knowledge from the Unified Medical Language System (UMLS) into a transformer-based PLM, to improve explainability and accuracy. Experimental results on four biomedical literature datasets show that our model outperforms existing state-of-the-art (SOTA) PLM-based summarization methods on biomedical extractive summarization. Furthermore, our use of graph neural topic modeling means that our model possesses the desirable property of being explainable, i.e. it is straightforward for users to understand how and why the model selects particular sentences for inclusion in the summary. The domain-specific knowledge also helps our model to learn more coherent topics, which in turn better explain its performance.