Showing 1 - 10 of 367 for search: '"A. van Genabith"'
In the era of high performing Large Language Models, researchers have widely acknowledged that contextual word representations are one of the key drivers in achieving top performances in downstream tasks. In this work, we investigate the degree of co
External link:
http://arxiv.org/abs/2409.14097
Author:
Wang, Qianli, Anikina, Tatiana, Feldhus, Nils, van Genabith, Josef, Hennig, Leonhard, Möller, Sebastian
Interpretability tools that offer explanations in the form of a dialogue have demonstrated their efficacy in enhancing users' understanding (Slack et al., 2023; Shen et al., 2023), as one-off explanations may fall short in providing sufficient inform
External link:
http://arxiv.org/abs/2401.12576
Pre-trained Language Models (PLMs) have been shown to be consistently successful in a plethora of NLP tasks due to their ability to learn contextualized representations of words (Ethayarajh, 2019). BERT (Devlin et al., 2018), ELMo (Peters et al., 2018) an
External link:
http://arxiv.org/abs/2312.06514
Pretrained language models (PLMs) form the basis of most state-of-the-art NLP technologies. Nevertheless, they are essentially black boxes: Humans do not have a clear understanding of what knowledge is encoded in different parts of the models, especi
External link:
http://arxiv.org/abs/2311.08240
Translated texts exhibit systematic linguistic differences compared to original texts in the same language, and these differences are referred to as translationese. Translationese has effects on various cross-lingual natural language processing tasks
External link:
http://arxiv.org/abs/2310.18830
Recent work has shown evidence of 'Clever Hans' behavior in high-performance neural translationese classifiers, where BERT-based classifiers capitalize on spurious correlations, in particular topic information, between data and target classification
External link:
http://arxiv.org/abs/2308.13170
Author:
Sagar, Sangeet, Ravanelli, Mirco, Kiefer, Bernd, Korbayova, Ivana Kruijff, van Genabith, Josef
Despite the recent advancements in speech recognition, there are still difficulties in accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments. This poses a particular challenge in the search and res
External link:
http://arxiv.org/abs/2306.04054
Most existing approaches for unsupervised bilingual lexicon induction (BLI) depend on good quality static or contextual embeddings requiring large monolingual corpora for both languages. However, unsupervised BLI is most likely to be useful for low-r
External link:
http://arxiv.org/abs/2305.14012
Dense vector representations for textual data are crucial in modern NLP. Word embeddings and sentence embeddings estimated from raw texts are key in achieving state-of-the-art results in various tasks requiring semantic understanding. However, obtain
External link:
http://arxiv.org/abs/2304.14796
Document-level neural machine translation (NMT) has outperformed sentence-level NMT on a number of datasets. However, document-level NMT is still not widely adopted in real-world translation systems mainly due to the lack of large-scale general-domai
External link:
http://arxiv.org/abs/2304.10216