Recent advances in machine translation using comparable corpora
Authors: | Pierre Zweigenbaum, Reinhard Rapp, Serge Sharoff |
---|---|
Year of publication: | 2016 |
Subject: | Linguistics and Language; Artificial neural network; Machine translation; Computer science; Comparability; Context (language use); 02 engineering and technology; Language and Linguistics; Field (computer science); Task (project management); Artificial Intelligence; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Software; Word (computer architecture); Natural language processing |
Source: | Natural Language Engineering. 22:501-516 |
ISSN: | 1469-8110 1351-3249 |
DOI: | 10.1017/s1351324916000115 |
Description: | This paper highlights some of the recent developments in the field of machine translation using comparable corpora. We start by updating previous definitions of comparable corpora and then look at bilingual versions of continuous vector space models. Recently, neural networks have been used to obtain latent context representations with only a few dimensions, which are often called word embeddings. These promising new techniques can be applied not only to parallel but also to comparable corpora. Subsequent sections of the paper discuss work specifically targeting machine translation using comparable corpora, as well as work dealing with the extraction of parallel segments from comparable corpora. Finally, we give an overview of the design and the results of a recent shared task on measuring document comparability across languages. |
Database: | OpenAIRE |