Idiomatic Expression Identification using Semantic Compatibility

Authors: Ziheng Zeng, Suma Bhat
Language: English
Publication year: 2021
Subject:
Source: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1546-1562 (2021)
Document type: article
ISSN: 2307-387X
DOI: 10.1162/tacl_a_00442
Description: Abstract. Idiomatic expressions are an integral part of natural language and are constantly being added to a language. Owing to their non-compositionality and their ability to take on a figurative or literal meaning depending on the sentential context, they have been a classical challenge for NLP systems. To address this challenge, we study the task of detecting whether a sentence has an idiomatic expression and localizing it when it occurs in a figurative sense. Prior research for this task has studied specific classes of idiomatic expressions offering limited views of their generalizability to new idioms. We propose a multi-stage neural architecture with attention flow as a solution. The network effectively fuses contextual and lexical information at different levels using word and sub-word representations. Empirical evaluations on three of the largest benchmark datasets with idiomatic expressions of varied syntactic patterns and degrees of non-compositionality show that our proposed model achieves new state-of-the-art results. A salient feature of the model is its ability to identify idioms unseen during training with gains from 1.4% to 30.8% over competitive baselines on the largest dataset.
Database: Directory of Open Access Journals