A novel multi document summarization with document-elements augmentation for learning materials using concept based ILP and clustering methods

Authors: Sakkaravarthy Iyyappan, K., Balasundaram, S. R.
Source: International Journal of Computers and Applications; February 2024, Vol. 46 Issue: 2 p78-89, 12p
Abstract: Multi Document Summarization (MDS) is a technique for extracting succinct summaries from groups of related documents. The usage of MDS in the e-learning context is more appealing for providing summaries for learning materials, which helps students and teachers to focus on key concepts of the learning materials. In a learning material, availability of non-textual document-elements such as figures/diagrams, plots, graphs, tables and algorithms can be seen widely, but keeping such document-elements in the summary is not possible as they are non-textual. This proposed work incorporates the text summarization approach with document-elements augmentation to the summary to provide a detailed coverage of information without exceeding the summary length constraint. The key information in the source text is identified by important phrase features and sentence features, and the summaries are generated by selecting important sentences using the Integer Linear Programming (ILP) framework while reducing the redundancy using pre-trained sentence vectors. The relationships between the summary and document-elements are identified through document-element snippet extraction and a Hierarchical Agglomerative Clustering approach. Experimental results of the proposed summary extraction and augmentation on educational dataset (EduSumm) show better performance compared to the state-of-the-art approaches.
Database: Supplemental Index
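
Note: The abstract describes selecting summary sentences with a concept-based ILP under a length constraint. The following is only a minimal sketch of that general idea, not the authors' formulation: it assumes each sentence is already mapped to a set of weighted concepts, and it replaces a real ILP solver with exhaustive enumeration, which gives the same optimum for a handful of sentences. The function name `select_sentences`, the sample sentences, and the weights are all hypothetical. The redundancy reduction with pre-trained sentence vectors and the clustering step are not modeled here; overlap is discouraged only because each concept is counted once.

```python
from itertools import combinations


def select_sentences(sentences, concept_weights, max_words):
    """Pick the sentence subset that maximizes covered concept weight.

    sentences: list of (text, set_of_concepts) pairs (hypothetical input format)
    concept_weights: dict mapping concept -> weight
    max_words: summary length budget, counted in whitespace-separated words

    This enumerates every subset, so it is only practical for tiny inputs;
    a real system would hand the same objective to an ILP solver.
    """
    best_score, best_subset = 0.0, ()
    indices = range(len(sentences))
    for size in range(1, len(sentences) + 1):
        for subset in combinations(indices, size):
            # Length constraint: total words must fit the budget.
            length = sum(len(sentences[i][0].split()) for i in subset)
            if length > max_words:
                continue
            # A concept counts once, however many selected sentences contain it.
            covered = set().union(*(sentences[i][1] for i in subset))
            score = sum(concept_weights.get(c, 0.0) for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return list(best_subset), best_score


# Hypothetical toy data, for illustration only.
sentences = [
    ("ILP selects important sentences", {"ilp", "selection"}),
    ("ILP selects sentences under a length budget", {"ilp", "selection", "budget"}),
    ("Clustering links figures to summary sentences", {"clustering", "figures"}),
]
weights = {"ilp": 3.0, "selection": 1.0, "budget": 1.0,
           "clustering": 2.0, "figures": 2.0}

chosen, score = select_sentences(sentences, weights, max_words=10)
print(chosen, score)
```

With a 10-word budget, the first and third sentences fit together and cover four distinct concepts, whereas the second sentence alone covers three; counting each concept once is what steers the selection away from the two overlapping ILP sentences.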