Decision Tree based Supervised Word Sense Disambiguation for Assamese
Author: | Jumi Sarmah, Shikhar Kr. Sarma |
---|---|
Publication year: | 2016 |
Subject: | Incremental decision tree; Computer science; Decision tree learning; Decision tree; Context (language use); Machine learning; C4.5 algorithm; Tree structure; Assamese language; Artificial intelligence; Decision model; Natural language processing; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020204 information systems; 03 medical and health sciences; 0305 other medical science; 030507 speech-language pathology & audiology |
Source: | International Journal of Computer Applications. 141:42-48 |
ISSN: | 0975-8887 |
DOI: | 10.5120/ijca2016909488 |
Description: | Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a context. A sense denotes the meaning of a word, and words that have several meanings in a context are referred to as ambiguous words. WSD is vital in many important Natural Language Processing tasks such as MT, IR, TC, and SP. This paper proposes a supervised machine learning approach, the Decision Tree, for the Word Sense Disambiguation task in the Assamese language. A Decision Tree is a decision model with a flowchart-like tree structure in which each internal node denotes a test, each branch represents the result of a test, and each leaf holds a sense label. J48, a Java implementation of the C4.5 decision tree algorithm, is used for the experiments. A few polysemous words, with different real occurrences in Assamese text and manual sense annotation, were collected as the training and test dataset. The DT algorithm produces an average F-measure of 0.611 when 10-fold cross-validation is performed on 10 Assamese ambiguous words. |
Database: | OpenAIRE |
External link: |
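
The tree structure outlined in the abstract (internal nodes as tests, branches as test outcomes, leaves as sense labels) and the F-measure used for evaluation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the English word "bank", its two sense labels, the context words, and the hand-built tree are hypothetical and are not taken from the paper, which uses Assamese data and learns its trees with J48 (C4.5) rather than constructing them by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Union


@dataclass
class Leaf:
    sense: str  # sense label held by the leaf


@dataclass
class Node:
    feature: str  # context word whose presence is tested at this node
    # One branch per test outcome (True: word present, False: word absent).
    branches: Dict[bool, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree: Union[Node, Leaf], context: Set[str]) -> str:
    """Follow test outcomes from the root until a leaf (sense label) is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[tree.feature in context]
    return tree.sense


def f_measure(gold, predicted, sense) -> float:
    """F-measure (harmonic mean of precision and recall) for one sense label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == sense and p == sense for g, p in pairs)
    fp = sum(g != sense and p == sense for g, p in pairs)
    fn = sum(g == sense and p != sense for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical hand-built tree for the ambiguous English word "bank".
tree = Node("river", {
    True: Leaf("riverside"),
    False: Node("money", {True: Leaf("institution"), False: Leaf("riverside")}),
})

# Hypothetical annotated occurrences: context words plus a gold sense label.
contexts = [{"river", "water"}, {"money", "loan"}, {"boat"}, {"money", "river"}]
gold = ["riverside", "institution", "riverside", "institution"]

predicted = [classify(tree, c) for c in contexts]
for sense in ("riverside", "institution"):
    print(sense, round(f_measure(gold, predicted, sense), 3))
```

A learner such as C4.5 would choose the test at each node from annotated training data instead of having it fixed in advance, and a 10-fold cross-validation would repeat the train-and-score cycle on ten different splits of the data before averaging the scores.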