Showing 1 - 7 of 7 for search: '"Isaac Caswell"'
Published in:
COLING
Large text corpora are increasingly important for a wide variety of Natural Language Processing (NLP) tasks, and automatic language identification (LangID) is a core technology needed to collect such datasets in a multilingual context. LangID is larg
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::25cfb0764a492e7b9e8ed9f2843fdbdc
http://arxiv.org/abs/2010.14571
Published in:
EMNLP (1)
The quality of automatic metrics for machine translation has been increasingly called into question, especially for high-quality systems. This paper demonstrates that, while choice of metric is important, the nature of the references is also critical
Published in:
EMNLP/IJCNLP (1)
Multilingual Neural Machine Translation (NMT) models have yielded large empirical success in transfer learning settings. However, these black-box representations are poorly understood, and their mode of transfer remains elusive. In this work, we atte
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::06bf1b6717501894c86c27917895fe79
http://arxiv.org/abs/1909.02197
Published in:
WMT (1)
In this work, we train an Automatic Post-Editing (APE) model and use it to reveal biases in standard Machine Translation (MT) evaluation procedures. The goal of our APE model is to correct typical errors introduced by the translation process, and con
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::16bacf7c0425eb8b3d1d7b6aad105114
http://arxiv.org/abs/1904.04790
Published in:
ACL
Most data selection research in machine translation focuses on improving a single domain. We perform data selection for multiple domains at once. This is achieved by carefully introducing instance-level domain-relevance features and automatically con
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fe436d1d2654cd91bbd54e80a7a7fb7e
Published in:
ACL (1)
Noise and domain are important aspects of data quality for neural machine translation. Existing research focuses separately on domain-data selection, clean-data selection, or their static combination, leaving the dynamic interaction across them not exp
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::213ae8dd704c7dfaee19b092566bb6e6
Published in:
WMT (1)
Recent work in Neural Machine Translation (NMT) has shown significant quality gains from noised-beam decoding during back-translation, a method to generate synthetic parallel data. We show that the main role of such synthetic noise is not to diversif
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::17cb9e745e5422c553c84deb26579fde