A Multilingual Integrated Framework for Processing Lexical Collocations
Author: | Violeta Seretan |
---|---|
Language: | English |
Publication year: | 2013 |
Subject: | Flexibility (engineering); Parsing; Machine translation; business.industry; Computer science; media_common.quotation_subject; Parse tree; Contrast (statistics); computer.software_genre; Reading (process); Key (cryptography); Artificial intelligence; ddc:410.2; Computational linguistics; business; computer; Natural language processing; media_common |
Source: | Computational Linguistics-Applications, pp. 87-108; Computational Linguistics; ISBN: 9783642343988 |
Description: | Lexical collocations are typical combinations of words, such as "heavy rain", "close collaboration", or "to meet a deadline". Pervasive in language, they are a key issue for NLP systems because, like other types of multi-word expressions such as idioms, they do not allow for word-by-word processing. We present a multilingual framework that emphasizes the accurate acquisition of collocational knowledge from corpora and its exploitation in two large-scale applications (parsing and machine translation), as well as for lexicographic support and reading assistance. The underlying methodology departs from mainstream approaches by relying on deep parsing to cope with the high morphosyntactic flexibility of collocations. We review theoretical claims and contrast them with practical work, showing our efforts to model collocations in an adequate and comprehensive way. Experimental results show the efficiency of our approach and the impact of collocational knowledge on the performance of parsing and machine translation. |
Database: | OpenAIRE |