TAXONOMY ENRICHMENT FOR RUSSIAN: SYNSET CLASSIFICATION OUTPERFORMS LINEAR HYPONYM-HYPERNYM PROJECTIONS

Authors: A. Plum, Maria Kunilovskaya, Andrey Kutuzov
Year of publication: 2020
Subject:
Source: Computational Linguistics and Intellectual Technologies.
ISSN: 2075-7182
DOI: 10.28995/2075-7182-2020-19-474-484
Description: We describe our system, which ranked third in the noun sub-track of the Taxonomy Enrichment for the Russian Language shared task offered by Dialogue Evaluation 2020. We present our best-performing system alongside the other methods and combinations we attempted; its results argue in favour of Occam’s razor for this task. A simple supervised classifier was trained with static distributional embeddings of hyponym words as features and the numeric identifiers of their hypernym synsets in the taxonomy as class labels. It outperformed more complicated approaches that learn linear projections from hyponym embeddings to hypernym embeddings and return the synset identifiers of the nearest neighbours of the predicted vectors. Training specially tailored word embeddings for ruWordNet multi-word expressions proved to be one of the key factors for both approaches.
Database: OpenAIRE
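
Illustration: the classification approach summarised in the description (hyponym embedding in, hypernym synset identifier out) can be sketched roughly as follows. This is a minimal toy sketch, not the authors' implementation: the record does not name the classifier, so multinomial logistic regression is an assumption, and the 3-dimensional vectors, the words, and the synset identifiers `animal-N` and `vehicle-N` are all hypothetical placeholders.

```python
import math
import random

# Hypothetical class labels: each stands for one hypernym synset identifier.
SYNSET_IDS = ["animal-N", "vehicle-N"]

# Hypothetical toy embeddings of hyponym words, with the index of their
# hypernym synset in SYNSET_IDS. Real systems would use pretrained vectors.
TRAIN = [
    ([0.9, 0.1, 0.0], 0),  # "cat"
    ([0.8, 0.2, 0.1], 0),  # "dog"
    ([0.1, 0.9, 0.0], 1),  # "car"
    ([0.0, 0.8, 0.2], 1),  # "truck"
]


def predict_proba(W, b, x):
    """Softmax over one linear score per synset class."""
    scores = [sum(wj * xj for wj, xj in zip(w, x)) + bc for w, bc in zip(W, b)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_softmax(data, n_classes, epochs=200, lr=0.5, seed=0):
    """Fit multinomial logistic regression with per-example gradient steps."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    W = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in data:
            probs = predict_proba(W, b, x)
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. the class score.
                grad = probs[c] - (1.0 if c == label else 0.0)
                for j in range(dim):
                    W[c][j] -= lr * grad * x[j]
                b[c] -= lr * grad
    return W, b


def top_k_synsets(W, b, x, synset_ids, k=1):
    """Return the k synset identifiers with the highest predicted probability."""
    probs = predict_proba(W, b, x)
    ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
    return [synset_ids[c] for c in ranked[:k]]


W, b = train_softmax(TRAIN, n_classes=len(SYNSET_IDS))
query = [0.85, 0.15, 0.05]  # hypothetical embedding of an unseen hyponym
print(top_k_synsets(W, b, query, SYNSET_IDS, k=2))
```

The projection-based alternative mentioned in the description would instead map the query vector into hypernym embedding space and look up the nearest neighbours there; the sketch above skips that step and ranks synset identifiers directly from the classifier scores.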