Automatic Wordnet Development for Low-Resource Languages using Cross-Lingual WSD
Authors: | Nasrin Taghizadeh, Hesham Faili |
---|---|
Year of publication: | 2016 |
Subject: |
Low resource; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; WordNet; 02 engineering and technology; computer.software_genre; eXtended WordNet; ComputingMethodologies_ARTIFICIALINTELLIGENCE; Resource (project management); Development (topology); Artificial Intelligence; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Semantic memory; Persian; Information retrieval; business.industry; InformationSystems_INFORMATIONSYSTEMSAPPLICATIONS; language.human_language; ComputingMethodologies_PATTERNRECOGNITION; language; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing; Meaning (linguistics) |
Source: | Journal of Artificial Intelligence Research. 56:61-87 |
ISSN: | 1076-9757 |
DOI: | 10.1613/jair.4968 |
Description: | Wordnets are an effective resource for natural language processing and information retrieval, especially for semantic processing and meaning-related tasks. Wordnets have been constructed for many languages, but their automatic development for low-resource languages has not been well studied. In this paper, an Expectation-Maximization algorithm is used to create high-quality, large-scale wordnets for low-resource languages. The proposed method draws on cross-lingual word sense disambiguation and builds a wordnet using only a bilingual dictionary and a monolingual corpus. The method was applied to Persian, and the resulting wordnet was evaluated in several experiments. The results show that the induced wordnet has a precision of 90% and a recall of 35%. |
Database: | OpenAIRE |