A Comprehensive Review of Arabic Text Summarization

Autor:	Asmaa Elsaid, Ammar Mohammed, Lamiaa Fattouh Ibrahim, Mohammed M. Sakre
Jazyk:	angličtina
Rok vydání:	2022
Předmět:	Text summarization arabic natural language processing machine learning extractive text summarization abstractive text summarization and deep learning models Electrical engineering. Electronics. Nuclear engineering TK1-9971
Zdroj:	IEEE Access, Vol 10, Pp 38012-38030 (2022)
Druh dokumentu:	article
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3163292
Popis:	The explosion of online and offline data has changed how we gather, evaluate, and understand data. It is frequently difficult and time-consuming to comprehend large text documents and extract crucial information from them. Text summarization techniques address the mentioned problems by compressing long texts while retaining their essential contents. These techniques rely on the fast delivery of filtered, high-quality content to their users. Due to the massive amounts of data generated by technology and various sources, automated text summarization of large-scale data is challenging. There are three types of automatic text summarization techniques: extractive, abstractive, and hybrid. Regardless of these previous techniques, the generated summaries are a long way from the summarization produced by human experts. Although Arabic is a widely spoken language that is frequently used for content sharing on the web, Arabic text summarization of Arabic content is limited and still immature because of several problems, including the Arabic language’s morphological structure, the variety of dialects, and the lack of adequate data sources. This paper reviews text summarization approaches and recent deep learning models for this approach. Additionally, it focuses on existing datasets for these approaches, which are also reviewed, along with their characteristics and limitations. The most often used metrics for summarization quality evaluation are ROUGE1, ROUGE2, ROUGE L, and Bleu. The challenges that are encountered during Arabic text summarizing methods and approaches and the solutions proposed in each approach are analyzed. Many Arabic text summarization methods have problems, such as the lack of golden tokens during testing, being out of vocabulary (OOV) words, repeating summary sentences, lack of standard systematic methodologies and architectures, and the complexity of the Arabic language. Finally, providing the required corpora, improving evaluation using semantic representations, the lack of using rouge metrics in abstractive text summarization, and using recent deep learning models to adopt them in Arabic summarization studies is an essential demand.
Databáze:	Directory of Open Access Journals
Externí odkaz:	https://doaj.org/article/3d57ae145e164d3f9b048fecba547463 Zobrazit plný text záznamu View record in DOAJ