A Multilingual Integrated Framework for Processing Lexical Collocations

Author: Violeta Seretan
Language: English
Year of publication: 2013
Subject:
Source: Computational Linguistics-Applications pp. 87-108
Computational Linguistics ISBN: 9783642343988
Description: Lexical collocations are typical combinations of words, such as "heavy rain", "close collaboration", or "to meet a deadline". Pervasive in language, they are a key issue for NLP systems since, like other types of multi-word expressions such as idioms, they do not allow for word-by-word processing. We present a multilingual framework that emphasizes the accurate acquisition of collocational knowledge from corpora and its exploitation in two large-scale applications (parsing and machine translation), as well as for lexicographic support and reading assistance. The underlying methodology departs from mainstream approaches by relying on deep parsing to cope with the high morphosyntactic flexibility of collocations. We review theoretical claims and contrast them with practical work, showing our efforts to model collocations in an adequate and comprehensive way. Experimental results show the efficiency of our approach and the impact of collocational knowledge on the performance of parsing and machine translation.
Database: OpenAIRE