A Novel Approach for Semantic Extractive Text Summarization

Autor: Waseemullah, Zainab Fatima, Shehnila Zardari, Muhammad Fahim, Maria Andleeb Siddiqui, Ag. Asri Ag. Ibrahim, Kashif Nisar, Laviza Falak Naz
Jazyk: angličtina
Rok vydání: 2022
Předmět:
Zdroj: Applied Sciences, Vol 12, Iss 9, p 4479 (2022)
Druh dokumentu: article
ISSN: 2076-3417
DOI: 10.3390/app12094479
Popis: Text summarization is a technique for shortening down or exacting a long text or document. It becomes critical when someone needs a quick and accurate summary of very long content. Manual text summarization can be expensive and time-consuming. While summarizing, some important content, such as information, concepts, and features of the document, can be lost; therefore, the retention ratio, which contains informative sentences, is lost, and if more information is added, then lengthy texts can be produced, increasing the compression ratio. Therefore, there is a tradeoff between two ratios (compression and retention). The model preserves or collects all the informative sentences by taking only the long sentences and removing the short sentences with less of a compression ratio. It tries to balance the retention ratio by avoiding text redundancies and also filters irrelevant information from the text by removing outliers. It generates sentences in chronological order as the sentences are mentioned in the original document. It also uses a heuristic approach for selecting the best cluster or group, which contains more meaningful sentences that are present in the topmost sentences of the summary. Our proposed model extractive summarizer overcomes these deficiencies and tries to balance between compression and retention ratios.
Databáze: Directory of Open Access Journals