Generating Summaries Through Unigram and Bigram: Text Summarization

Authors: Alsharman, Nesreen; Pivkina, Inna
Source: International Journal of Information Technology and Web Engineering; January 2020, Vol. 15, Issue 1, pp. 64-74 (11 pages)
Abstract: This article describes a new method for generating extractive summaries directly from extracted unigrams and bigrams. The method uses selective part-of-speech tagging to extract significant unigrams and bigrams from a set of sentences. The extracted unigrams and bigrams, along with other features, are then used to build the final summary. A new selective rule-based part-of-speech tagging system is developed that concentrates on the parts of speech most important for summarization: nouns, verbs, and adjectives. Other parts of speech, such as prepositions, articles, and adverbs, play a lesser role in determining the meaning of a sentence and are therefore not considered when choosing significant unigrams and bigrams. The proposed method is tested on two problem domains: the citation and Opinosis data sets. Results show that the proposed method performs better than the TextRank, LexRank, and Edmundson summarization methods. The proposed method is general enough to summarize texts from any domain.
Database: Supplemental Index
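
The pipeline outlined in the abstract (keep only content words, collect significant unigrams and bigrams, then pick sentences) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the tagging rules, the "other features", or the scoring function, so everything here is an assumption: the `FUNCTION_WORDS` list is a hypothetical stand-in for the selective part-of-speech tagger, and the frequency-sum score in `summarize` is only one plausible way to rank sentences.

```python
import re
from collections import Counter

# Hypothetical stand-in for the selective tagger: any token NOT in this small
# function-word list is treated as a content word (noun, verb, or adjective).
# Simplification: copulas such as "is" are dropped even though they are verbs.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "for", "with", "and",
    "or", "but", "is", "are", "was", "were", "be", "it", "this", "that",
    "very", "quite", "really", "not", "by", "from", "as",
}


def content_words(sentence):
    """Lowercase, tokenize, and keep only the assumed content words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    per_sentence = [content_words(s) for s in sentences]

    # Count unigrams and bigrams over content words only.
    unigrams = Counter()
    bigrams = Counter()
    for words in per_sentence:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))

    def score(words):
        # Assumed scoring: sum of corpus frequencies of the distinct
        # unigrams and bigrams that occur in the sentence.
        uni = sum(unigrams[w] for w in set(words))
        bi = sum(bigrams[b] for b in set(zip(words, words[1:])))
        return uni + bi

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(per_sentence[i]),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    reviews = [
        "The battery life is great.",
        "Battery life is really great on this phone.",
        "The screen is dim.",
    ]
    print(summarize(reviews, k=1))
```

Because the score is an unnormalized sum, longer sentences tend to rank higher; dividing by the number of content words would be one way to offset that, again as an assumption rather than something the abstract states.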