Decision Tree based Supervised Word Sense Disambiguation for Assamese

Authors: Jumi Sarmah, Shikhar Kr. Sarma
Year of publication: 2016
Subject:
Source: International Journal of Computer Applications. 141:42-48
ISSN: 0975-8887
DOI: 10.5120/ijca2016909488
Description: Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a given context. A sense denotes the meaning of a word, and words that can take several meanings are referred to as ambiguous words. WSD is vital in many important Natural Language Processing tasks such as MT, IR, TC, and SP. This paper proposes a supervised Machine Learning approach, the Decision Tree (DT), for the Word Sense Disambiguation task in the Assamese language. A Decision Tree is a flowchart-like, tree-structured decision model in which each internal node denotes a test, each branch represents the result of a test, and each leaf holds a sense label. J48, a Java implementation of the C4.5 decision tree algorithm, is used for the experiments. A few polysemous words, with different real occurrences in Assamese text and manual sense annotation, were collected as the training and test dataset. The DT algorithm produces an average F-measure of 0.611 under 10-fold cross-validation on 10 Assamese ambiguous words.
Database: OpenAIRE
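
Illustrative note: the decision-tree structure described in the abstract (internal nodes as tests, branches as test results, leaves as sense labels) can be sketched in a few lines of Python. This is only a minimal, hand-built illustration and not the authors' J48/C4.5 setup: the tree is written by hand rather than learned, the ambiguous word ("bank") and its context features are hypothetical English stand-ins for the Assamese data, and the names `Node`, `Leaf`, `classify`, and `f_measure` are assumptions introduced for this sketch. The `f_measure` helper uses the standard harmonic mean of precision and recall.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass
class Leaf:
    """A leaf holds a sense label."""
    sense: str


@dataclass
class Node:
    """An internal node tests whether a context word is present."""
    feature: str      # context word to test for (hypothetical feature)
    present: "Tree"   # branch taken when the word occurs in the context
    absent: "Tree"    # branch taken when it does not


Tree = Union[Leaf, Node]


def classify(tree: Tree, context: Set[str]) -> str:
    """Follow test outcomes from the root until a leaf (sense label) is reached."""
    while isinstance(tree, Node):
        tree = tree.present if tree.feature in context else tree.absent
    return tree.sense


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hand-built toy tree for the hypothetical ambiguous word "bank".
toy_tree = Node(
    feature="river",
    present=Leaf("bank/riverside"),
    absent=Node(
        feature="water",
        present=Leaf("bank/riverside"),
        absent=Leaf("bank/financial"),
    ),
)

if __name__ == "__main__":
    print(classify(toy_tree, {"river", "walk"}))
    print(classify(toy_tree, {"loan", "interest"}))
    print(f_measure(0.5, 0.5))
```

In a learned tree such as one produced by C4.5, the tests and their order would be chosen from the sense-annotated training data rather than written by hand as they are here.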