Attention based Abstractive Summarization of Malayalam Document

Authors: David Peter S, Sindhya K Nambiar, Sumam Mary Idicula
Publication year: 2021
Subject:
Source: Procedia Computer Science. 189:250-257
ISSN: 1877-0509
DOI: 10.1016/j.procs.2021.05.088
Description: Natural language processing offers several approaches to text summarization. Among them, abstractive summarization is one of the more challenging problems, and very little research on it has been done for regional languages. Unlike other summarization techniques, which reuse words and phrases from the source text, abstractive summarization builds a short, concise summary of a large document from the underlying message of the text, not necessarily using the same words and phrases as the source. The objective of the proposed work is to create a brief and understandable abstractive summary of a Malayalam document. Malayalam is one of the 22 scheduled languages of India, is spoken by over 34 million people, and is designated a Classical Language in India. As a Classical language, Malayalam has unique syntactic and semantic rules, which makes this work more important. The proposed work uses an attention mechanism to generate the summary of the source document. The goal was to compare the efficiency of the attention model with a sequence-to-sequence baseline model on Malayalam text, and thereby to implement a better abstractive text summarizer for Malayalam documents.
Database: OpenAIRE
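
Note: The abstract contrasts an attention model with a plain sequence-to-sequence baseline but does not say which attention variant the authors used. As a rough illustration only, the sketch below shows one common form, dot-product attention, for a single decoder step: each encoder state is scored against the current decoder state, the scores are normalized with a softmax, and the weighted sum becomes the context vector. The function names (`softmax`, `attend`) and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Dot-product attention for one decoder step (illustrative assumption).

    Returns the attention weights over the source positions and the
    context vector, i.e. the weighted sum of the encoder states.
    """
    # Score each source position by its dot product with the decoder state.
    scores = [
        sum(d * h for d, h in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector mixes encoder states according to the weights.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three source positions with 2-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

A baseline sequence-to-sequence model without attention would instead condition every decoder step on one fixed vector, typically the final encoder state, which is the difference such a comparison is meant to measure.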