Automatic Text Summarization Based on Semantic Networks and Corpus Statistics

Author: Winda Yulita, Sigit Priyanta, Azhari SN
Language: English, Indonesian
Publication year: 2019
Subject:
Source: IJCCS (Indonesian Journal of Computing and Cybernetics Systems), Vol 13, Iss 2, Pp 137-148 (2019)
Document type: article
ISSN: 1978-1520, 2460-7258
DOI: 10.22146/ijccs.38261
Description: One simple automatic text summarization method that can minimize redundancy in a summary is the Maximum Marginal Relevance (MMR) method. A disadvantage of MMR is that its summaries can contain parts that are separated from each other and not semantically connected. This study therefore compares summaries produced by semantic-based MMR with those produced by non-semantic MMR. The semantic-based MMR method uses WordNet Bahasa and a corpus when processing text summaries; the non-semantic MMR method is based on TF-IDF. The study also compresses summaries at rates of 30%, 20%, and 10%. The research data consist of 50 online news texts. The resulting summaries are evaluated with the ROUGE toolkit. The best f-score for semantic-based MMR is 0.561, while the best f-score for non-semantic MMR is 0.598. These values are obtained when stemming is added as a preprocessing step and the summary is compressed to 30%. The difference is attributed to the incompleteness of WordNet Bahasa and to several words in the news titles that do not conform to EYD (KBBI).
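The non-semantic variant described above (MMR over TF-IDF vectors, with a compression rate) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of the news title as the query, the smoothed IDF formula, and the trade-off weight `lam` are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (no stemming in this sketch)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed smoothed IDF, so terms found in every sentence keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_summarize(sentences, query, ratio=0.3, lam=0.7):
    """Greedy MMR: keep about `ratio` of the sentences, in original order.

    `query` is assumed to be the news title; `lam` (hypothetical default)
    weighs relevance to the query against redundancy with chosen sentences.
    """
    k = max(1, round(len(sentences) * ratio))
    vectors = tfidf_vectors([tokenize(s) for s in sentences] + [tokenize(query)])
    sent_vecs, query_vec = vectors[:-1], vectors[-1]
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = cosine(sent_vecs[i], query_vec)
        redundancy = max(
            (cosine(sent_vecs[i], sent_vecs[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]
```

Calling `mmr_summarize(sentences, title, ratio=0.3)` would correspond to the 30% compression setting. A semantic-based variant would replace `cosine` with a similarity measure built on WordNet Bahasa and corpus statistics, which this sketch does not attempt.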
Database: Directory of Open Access Journals