Automatic Wordnet Development for Low-Resource Languages using Cross-Lingual WSD

Authors: Nasrin Taghizadeh, Hesham Faili
Year of publication: 2016
Subject:
Source: Journal of Artificial Intelligence Research. 56:61-87
ISSN: 1076-9757
DOI: 10.1613/jair.4968
Description: Wordnets are an effective resource for natural language processing and information retrieval, especially for semantic processing and meaning-related tasks. Wordnets have been constructed for many languages, but their automatic development for low-resource languages has not been well studied. In this paper, an Expectation-Maximization algorithm is used to create high-quality, large-scale wordnets for low-resource languages. The proposed method relies on cross-lingual word sense disambiguation and builds a wordnet using only a bilingual dictionary and a monolingual corpus. The method has been applied to Persian, and the resulting wordnet has been evaluated through several experiments. The results show that the induced wordnet has a precision of 90% and a recall of 35%.
Database: OpenAIRE