Showing 1 - 10 of 74 for search: '"España Bonet, Cristina"'
Translated texts exhibit systematic linguistic differences compared to original texts in the same language, and these differences are referred to as translationese. Translationese has effects on various cross-lingual natural language processing tasks…
External link: http://arxiv.org/abs/2310.18830
Author: España-Bonet, Cristina
Neutrality is difficult to achieve and, in politics, subjective. Traditional media typically adopt an editorial line that can be used by their potential readers as an indicator of the media bias. Several platforms currently rate news outlets according…
External link: http://arxiv.org/abs/2310.16269
Recent work has shown evidence of 'Clever Hans' behavior in high-performance neural translationese classifiers, where BERT-based classifiers capitalize on spurious correlations, in particular topic information, between data and target classification…
External link: http://arxiv.org/abs/2308.13170
Most existing approaches for unsupervised bilingual lexicon induction (BLI) depend on good quality static or contextual embeddings requiring large monolingual corpora for both languages. However, unsupervised BLI is most likely to be useful for low-resource…
External link: http://arxiv.org/abs/2305.14012
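As background to the BLI snippet above: once the two embedding spaces are aligned, a lexicon is typically induced by nearest-neighbour retrieval under cosine similarity. A minimal sketch, assuming toy hand-made 3-d "aligned" embeddings (the words and vectors are invented for illustration, not from the paper):

```python
import math

def cosine(u, v):
    # cosine similarity between two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def induce_lexicon(src_emb, tgt_emb):
    # for each source word, pick the target word whose embedding
    # (assumed already mapped into the same space) is most similar
    return {
        s_word: max(tgt_emb, key=lambda t: cosine(s_vec, tgt_emb[t]))
        for s_word, s_vec in src_emb.items()
    }

# toy "aligned" embeddings (illustration only)
src = {"perro": [0.9, 0.1, 0.0], "gato": [0.1, 0.9, 0.0]}
tgt = {"dog": [0.8, 0.2, 0.1], "cat": [0.2, 0.8, 0.1]}
print(induce_lexicon(src, tgt))  # {'perro': 'dog', 'gato': 'cat'}
```

Real systems retrieve over full vocabularies and use refinements such as CSLS scoring to counter hubness; the nearest-neighbour core is the same.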
Dense vector representations for textual data are crucial in modern NLP. Word embeddings and sentence embeddings estimated from raw texts are key in achieving state-of-the-art results in various tasks requiring semantic understanding. However, obtain…
External link: http://arxiv.org/abs/2304.14796
Recent work has shown that neural feature- and representation-learning, e.g. BERT, achieves superior performance over traditional manual feature engineering based approaches, with e.g. SVMs, in translationese classification tasks. Previous research…
External link: http://arxiv.org/abs/2210.13391
Author: Ruiter, Dana, Kleinbauer, Thomas, España-Bonet, Cristina, van Genabith, Josef, Klakow, Dietrich
Recent research on style transfer takes inspiration from unsupervised neural machine translation (UNMT), learning from large amounts of non-parallel data by exploiting cycle consistency loss, back-translation, and denoising autoencoders. By contrast, …
External link: http://arxiv.org/abs/2205.08814
Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets. However, compared to original texts in the same language, translations possess…
External link: http://arxiv.org/abs/2205.08001
Author: Pylypenko, Daria, Amponsah-Kaakyire, Kwabena, Chowdhury, Koel Dutta, van Genabith, Josef, España-Bonet, Cristina
Traditional hand-crafted linguistically-informed features have often been used for distinguishing between translated and original non-translated texts. By contrast, to date, neural architectures without manual feature engineering have been less explored…
External link: http://arxiv.org/abs/2109.07604
For most language combinations, parallel data is either scarce or simply unavailable. To address this, unsupervised machine translation (UMT) exploits large amounts of monolingual data by using synthetic data generation techniques such as back-translation…
External link: http://arxiv.org/abs/2107.08772
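The back-translation idea named in the last snippet can be sketched as a data-generation loop: a target-to-source model translates real monolingual target text into synthetic source sentences, and the resulting synthetic pairs are then used to train the source-to-target model. A toy illustration in which a word-by-word dictionary stands in for the trained model (the dictionary and words are invented for illustration):

```python
# stand-in for a trained target->source model (illustration only)
TGT2SRC = {"the": "el", "dog": "perro"}

def translate_tgt_to_src(sentence):
    # toy word-by-word "translation"; a real UMT system would use
    # a neural model here, not a dictionary lookup
    return " ".join(TGT2SRC.get(word, word) for word in sentence.split())

def make_synthetic_pairs(monolingual_tgt):
    # back-translation: pair each real target sentence with its
    # synthetic source translation; these (synthetic source, real
    # target) pairs then serve as training data for source->target
    return [(translate_tgt_to_src(s), s) for s in monolingual_tgt]

print(make_synthetic_pairs(["the dog"]))  # [('el perro', 'the dog')]
```

In practice the two translation directions are trained in alternation, each one generating synthetic data for the other, alongside the denoising objectives mentioned in the style-transfer snippet above.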