Improvement in Automatic Classification of Persian Documents by Means of Support Vector Machine and Representative Vector

Author:	Ezadi Hamed, Noohi Taher, Hossennejad Mihan, Jafari Ashkan
Publication year:	2011
Subject:	Support vector machine; Structured support vector machine; Computer science; Pattern recognition; Artificial intelligence; Precision and recall; Word (computer architecture); Persian
Source:	Communications in Computer and Information Science, ISBN: 9783642273360
DOI:	10.1007/978-3-642-27337-7_27
Description:	A Representative Vector is a vector that holds related words together with the degree of their relationship. This paper examines the effect of using such vectors on the automatic classification of Persian documents. In the proposed method, the documents are first preprocessed: extra words are identified and word stems are found. Next, features are extracted for each category using one of the known methods. A Representative Vector is then built from the extracted features, yielding additional, more specific words that better represent each category. The experiments show that Precision and Recall can be increased significantly by omitting extra words, adding a few words in the Representative Vectors, and using a well-known classification model such as the Support Vector Machine (SVM).
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::fec346ea2943fe74f832ccc9c4f893c8 https://doi.org/10.1007/978-3-642-27337-7_27
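
The description outlines a pipeline: remove extra words, extract category features, then expand them with related words and their degrees of relationship. The following is a minimal sketch of that expansion step only, not the authors' implementation. The stop-word list, the function names `preprocess` and `representative_vector`, and the co-occurrence ratio used as the "degree of relationship" are all assumptions made for illustration; the toy documents are in English for readability, and stemming is left out.

```python
from collections import Counter

# Hypothetical stop-word list; the record does not give the actual "extra words".
STOP_WORDS = {"the", "a", "of", "and", "in", "is"}


def preprocess(text):
    """Lowercase, split on whitespace, and drop extra (stop) words.

    Stemming is omitted; a real system would apply a Persian stemmer here.
    """
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def representative_vector(docs, seed_features, top_k=2):
    """Map related words to a degree of relationship with the seed features.

    Assumption: the degree is the fraction of seed-containing documents in
    which the word also appears. The paper may define it differently.
    """
    seeds = set(seed_features)
    cooccur = Counter()
    seed_docs = 0
    for text in docs:
        tokens = set(preprocess(text))
        if tokens & seeds:
            seed_docs += 1
            cooccur.update(tokens - seeds)
    if seed_docs == 0:
        return {}
    # Highest co-occurrence first; ties are broken alphabetically.
    ranked = sorted(cooccur.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {word: count / seed_docs for word, count in ranked}


# Toy documents for a hypothetical "sports" category.
sports_docs = [
    "the team won the football match",
    "football players and the coach of the team",
    "the coach praised the team",
    "prices in the market",
]
vector = representative_vector(sports_docs, ["football"], top_k=2)
expanded_features = ["football"] + sorted(vector)
print(vector)
print(expanded_features)
```

The expanded feature list could then serve as the vocabulary for a bag-of-words representation passed to an SVM classifier; scikit-learn's `LinearSVC` would be one possible choice, though the record does not say which SVM implementation was used.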