HEURISTICS-BASED METHOD FOR HEAD AND MODIFIER DETECTION IN MALAY SENTENCES FROM THE CULTURAL HERITAGE DOMAIN

Authors: Suhaimi Ab Rahman, Nazlia Omar
Language: English, Malay
Year of publication: 2017
Subject:
Source: Asia-Pacific Journal of Information Technology and Multimedia, Vol 6, Iss 01, Pp 13-21 (2017)
Document type: article
ISSN: 2289-2192
86761625
DOI: 10.17576/apjitm-2017-0601-02
Description: Detecting the head and modifier in Malay sentences from the cultural heritage domain is difficult because their positions vary with sentence structure. Language experts have also put forward differing views on the theory and concept of head and modifier detection in compound nouns, and existing research is limited, especially in computational linguistics. Research is therefore needed to identify appropriate methods for detecting the head and modifier in Malay sentences from the cultural heritage domain. The aim of this study is to construct a list of heuristic rules for detecting the position of compound nouns in such sentences. A set of 15 heuristic rules is formulated, which also detects the position of the head and modifier within each compound noun. Accuracy is measured using precision, recall and F1-score. In the experiments, the Sentence Structure of Malay Cultural Heritage Domain (SADWBM) achieves an F1-score of 80.4%, compared with 56% for the Noun Phrase Structure (SFN). SADWBM therefore outperforms SFN, indicating that the approach used in this study is effective in resolving the identified problems.
Database: Directory of Open Access Journals
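
The abstract reports precision, recall and F1-score but does not say how detections are matched against the reference annotation. As a minimal sketch, assuming each detection is an exact-match triple of sentence id, token index and role ("head" or "modifier"), the three metrics could be computed as below. The `Span` type, the function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

# Assumed representation: (sentence_id, token_index, role),
# where role is "head" or "modifier". Not the paper's own format.
Span = Tuple[int, int, str]


def precision_recall_f1(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over sets of labeled spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotation, for illustration only.
gold = {(0, 0, "head"), (0, 1, "modifier"), (1, 2, "head"), (1, 3, "modifier")}
predicted = {(0, 0, "head"), (0, 1, "modifier"), (1, 2, "modifier")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here two of the three predictions match the reference exactly, so precision is 2/3, recall is 2/4, and F1 is their harmonic mean, 4/7. A token given the wrong role counts as both a false positive and a false negative under this matching scheme.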