Multiprocessing Stemming: A Case Study of Indonesian Stemming

Authors: Mastura Diana Marieska, Novi Yusliani, Rifkie Primartha
Year of publication: 2019
Subject:
Source: International Journal of Computer Applications. 182:15-19
ISSN: 0975-8887
DOI: 10.5120/ijca2019918476
Description: Research in Natural Language Processing (NLP) is growing, especially with the rise of "big data". Ready-to-use programming libraries have become very important for speeding up research. Some mature libraries are available, but they are generally built for English and are language-dependent, so they cannot be used for other languages. Stemming is one of the basic processes in NLP. A commonly used Indonesian stemming algorithm is ECS (Enhanced Confix Stripping), and one of the libraries that implements it is Sastrawi. Experimental results show that stemming with Sastrawi is still slow. This research therefore optimizes stemming speed using multiprocessing (MP). The test data used in this research were collected manually from Wikipedia. The experimental results show that the MP technique reduces the average stemming time by about 98.45%.
Database: OpenAIRE
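
Example: The description above says stemming was sped up with multiprocessing but gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' code. It assumes documents can be stemmed independently of one another and distributes them over worker processes with Python's standard multiprocessing.Pool. The names toy_stem, stem_text, stem_sequential and stem_parallel are hypothetical, and toy_stem is a stand-in suffix stripper, not ECS and not Sastrawi; in practice the real stemmer call would replace it.

```python
from multiprocessing import Pool

# Toy suffix list for illustration only; this is NOT the ECS rule set.
SUFFIXES = ("nya", "kan", "an")


def toy_stem(word):
    """Placeholder stemmer: strip at most one known suffix.

    Assumption: stands in for a real (slow) stemmer such as an ECS
    implementation. Stems shorter than 3 characters are not produced.
    """
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_text(text):
    """Lowercase a document, split on whitespace, and stem every token."""
    return " ".join(toy_stem(w) for w in text.lower().split())


def stem_sequential(documents):
    """Baseline: stem the documents one after another in a single process."""
    return [stem_text(d) for d in documents]


def stem_parallel(documents, workers=4):
    """Stem the documents in a pool of worker processes.

    pool.map preserves input order, so the result lines up with
    the sequential baseline.
    """
    with Pool(processes=workers) as pool:
        return pool.map(stem_text, documents)


if __name__ == "__main__":
    docs = ["Bukunya dibacakan", "makanan minuman"] * 1000
    # Both strategies must agree; only the wall-clock time should differ.
    assert stem_parallel(docs, workers=2) == stem_sequential(docs)
```

Any speedup from this pattern depends on the number of cores, the cost of the stemmer per document, and the overhead of starting processes and passing data between them, so the sketch should not be expected to reproduce the figure reported in the description.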