Feature Selection using k-Medoid Algorithm for Categorization of Hadith Translation in English

Author: Firda Ayu Setiawati, Q. U. Safitri, Arief Fatchul Huda, Wahyudin Darmalaksana, Aep Saepulloh
Year of publication: 2019
Subject:
Source: 2019 IEEE 5th International Conference on Wireless and Telematics (ICWT).
DOI: 10.1109/icwt47785.2019.8978221
Description: A central problem in document classification is the very large number of features, which depends on the number of terms in the vocabulary. Each document contains only a small fraction of the vocabulary, so most elements of its feature vector are zero. A feature-selection method is therefore proposed that picks a subset of features to represent the full set. The method clusters the vocabulary, and the representative of each resulting cluster is used as a feature for every document in the categorization process. Categorization is performed with the k-Nearest Neighbor (k-NN) and Nearest Centroid (NC) algorithms. The data are English translations of hadith. With this method, computation is expected to be faster and categorization more accurate.
Database: OpenAIRE
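
The pipeline outlined in the description (cluster the vocabulary, keep one representative term per cluster, then categorize with k-NN or Nearest Centroid) can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not state the term weighting, the distance measure, the number of clusters, or the exact k-medoid variant, so the sketch assumes raw term counts, Euclidean distance, k = 3, and a simple alternating k-medoids loop. The toy documents, labels, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def term_vectors(docs):
    """Describe each term by its counts across all documents (assumed weighting)."""
    vocab = sorted({w for d in docs for w in d.split()})
    cols = {t: [d.split().count(t) for d in docs] for t in vocab}
    return vocab, cols


def k_medoids(items, vectors, k, seed=0, max_iter=100):
    """Alternating k-medoids: assign items to the nearest medoid, then re-pick medoids."""
    rng = random.Random(seed)
    medoids = rng.sample(items, k)
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for it in items:
            # A medoid always stays in its own cluster, so no cluster is empty.
            if it in clusters:
                nearest = it
            else:
                nearest = min(medoids, key=lambda m: euclidean(vectors[it], vectors[m]))
            clusters[nearest].append(it)
        # New medoid: the member with the smallest total distance to its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(euclidean(vectors[c], vectors[o]) for o in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)


def doc_features(doc, features):
    """Reduced document vector: counts of the selected medoid terms only."""
    words = doc.split()
    return [words.count(f) for f in features]


def nearest_centroid_predict(X, y, x):
    centroids = {}
    for label in sorted(set(y)):
        rows = [r for r, l in zip(X, y) if l == label]
        centroids[label] = [sum(c) / len(rows) for c in zip(*rows)]
    return min(centroids, key=lambda l: euclidean(centroids[l], x))


def knn_predict(X, y, x, k=3):
    nearest = sorted(zip(X, y), key=lambda p: euclidean(p[0], x))[:k]
    return Counter(l for _, l in nearest).most_common(1)[0][0]


# Toy data, purely illustrative (not taken from the paper's hadith corpus).
docs = [
    "water wash clean water",
    "wash hands water",
    "trade market price",
    "price market sell trade",
]
labels = ["hygiene", "hygiene", "trade", "trade"]

vocab, cols = term_vectors(docs)
features = k_medoids(vocab, cols, k=3)          # selected representative terms
X = [doc_features(d, features) for d in docs]   # documents in the reduced space

query = doc_features("clean water wash", features)
print("selected features:", features)
print("nearest centroid:", nearest_centroid_predict(X, labels, query))
print("k-NN:", knn_predict(X, labels, query, k=3))
```

Clustering terms rather than documents is what shrinks the feature space here: each document vector drops from one entry per vocabulary term to one entry per cluster, which is the source of the expected speed-up mentioned in the description.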