Showing 1 - 10 of 140 for the search: '"Monz, Christof."'
This paper explores the impact of variable pragmatic competence on communicative success by simulating language learning and conversation between speakers and listeners with different levels of reasoning ability. Through studying this interaction …
External link:
http://arxiv.org/abs/2410.05851
Extremely low-resource (XLR) languages lack substantial corpora for training NLP models, motivating the use of all available resources such as dictionaries and grammar books. Machine Translation from One Book (Tanzer et al., 2024) suggests prompting … (an illustrative sketch follows below).
External link:
http://arxiv.org/abs/2409.19151
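The entry above builds on prompting a long-context LLM with a grammar book and a bilingual word list. As a rough illustration only, the sketch below assembles such a prompt; the file names, prompt wording, and the idea of simply concatenating the materials are assumptions for illustration, not the paper's actual setup.

```python
# Hedged sketch: build one long prompt from a grammar book and a word list
# for an extremely low-resource source language. File names and prompt
# wording are illustrative placeholders, not the paper's setup.
from pathlib import Path

def build_prompt(grammar_path: str, dictionary_path: str, source_sentence: str) -> str:
    """Concatenate reference material and the sentence to translate into one prompt."""
    grammar = Path(grammar_path).read_text(encoding="utf-8")
    dictionary = Path(dictionary_path).read_text(encoding="utf-8")
    return (
        "You are translating from the source language into English.\n\n"
        f"Grammar book:\n{grammar}\n\n"
        f"Bilingual word list:\n{dictionary}\n\n"
        f"Translate into English: {source_sentence}\n"
    )

if __name__ == "__main__":
    prompt = build_prompt("grammar.txt", "wordlist.tsv", "an example sentence")
    print(len(prompt))  # the prompt would then be sent to a long-context LLM
```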
Author:
Liao, Baohao, Monz, Christof
Parameter-efficient finetuning (PEFT) methods effectively adapt large language models (LLMs) to diverse downstream tasks, reducing storage and GPU memory demands. Despite these advantages, several applications pose new challenges to PEFT beyond mere … (a hedged sketch of one common PEFT method follows below).
External link:
http://arxiv.org/abs/2409.00119
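The abstract above does not name a specific method; as one widely used example of parameter-efficient finetuning, the sketch below shows a LoRA-style adapter that freezes the base weights and trains only a low-rank update. It is a generic illustration, not the method studied in the paper.

```python
# Minimal LoRA-style adapter: freeze the base linear layer, train only a
# low-rank update A @ B. Generic PEFT illustration, not this paper's method.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # base weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(base.in_features, r) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(r, base.out_features))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a @ self.lora_b) * self.scaling

layer = LoRALinear(nn.Linear(4096, 4096))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")              # ~65k vs. ~16.8M in the base layer
```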
This paper introduces two multilingual systems, IKUN and IKUN-C, developed for the general machine translation task in WMT24. IKUN and IKUN-C represent an open system and a constrained system, respectively, built on Llama-3-8b and Mistral-7B-v0.3. Both …
External link:
http://arxiv.org/abs/2408.11512
The massive amounts of web-mined parallel data contain large amounts of noise. Semantic misalignment, as the primary source of this noise, poses a challenge for training machine translation systems. In this paper, we first study the impact of real-world … (a hedged filtering sketch follows below).
External link:
http://arxiv.org/abs/2407.02208
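One simple baseline for the misalignment problem described above is to score each source/target pair with a multilingual sentence encoder and drop pairs whose cosine similarity falls below a threshold. The sketch below shows only that baseline; `embed` and the 0.7 threshold are hypothetical placeholders, not the paper's method.

```python
# Hedged sketch: similarity-based filtering of web-mined parallel data.
# `embed` is a hypothetical placeholder for any multilingual sentence encoder;
# the threshold is illustrative, not taken from the paper.
import numpy as np

def embed(sentences: list[str]) -> np.ndarray:
    """Placeholder: return one embedding vector per sentence."""
    raise NotImplementedError("plug in a multilingual sentence encoder here")

def filter_pairs(pairs: list[tuple[str, str]], threshold: float = 0.7) -> list[tuple[str, str]]:
    src = embed([s for s, _ in pairs])
    tgt = embed([t for _, t in pairs])
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    scores = (src * tgt).sum(axis=1)                     # cosine similarity per pair
    return [pair for pair, score in zip(pairs, scores) if score >= threshold]
```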
Author:
Chen, Xinyi, Liao, Baohao, Qi, Jirui, Eustratiadis, Panagiotis, Monz, Christof, Bisazza, Arianna, de Rijke, Maarten
Following multiple instructions is a crucial ability for large language models (LLMs). Evaluating this ability comes with significant challenges: (i) limited coherence between multiple instructions, (ii) positional bias, where the order of instructions … (a hedged probing sketch follows below).
External link:
http://arxiv.org/abs/2406.19999
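Positional bias of the kind mentioned above can be probed by permuting the order of instructions across prompts and tracking success per position. The harness below is a generic sketch of that idea; `ask_model` and `check` are hypothetical placeholders, not the benchmark's code.

```python
# Hedged sketch: estimate per-position success rates by shuffling the order
# of instructions in the prompt. `ask_model` and `check` are placeholders.
import random
from collections import defaultdict

def ask_model(prompt: str) -> str:
    raise NotImplementedError("call an LLM here")

def check(instruction: str, response: str) -> bool:
    raise NotImplementedError("verify whether this instruction was followed")

def positional_bias(instructions: list[str], trials: int = 20) -> dict[int, float]:
    wins, counts = defaultdict(int), defaultdict(int)
    for _ in range(trials):
        order = random.sample(instructions, k=len(instructions))   # random permutation
        prompt = "Follow all instructions:\n" + "\n".join(
            f"{i + 1}. {ins}" for i, ins in enumerate(order))
        response = ask_model(prompt)
        for pos, ins in enumerate(order):
            counts[pos] += 1
            wins[pos] += check(ins, response)
    # a large gap between early and late positions indicates positional bias
    return {pos: wins[pos] / counts[pos] for pos in counts}
```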
While multilingual language models (MLMs) have been trained on 100+ languages, they are typically only evaluated across a handful of them due to a lack of available test data in most languages. This is particularly problematic when assessing MLMs' …
External link:
http://arxiv.org/abs/2406.14267
Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear what the impact of fine-tuning is on desirable LLM behaviors that are not present in neural machine translation …
External link:
http://arxiv.org/abs/2405.20089
Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference. Language-specific modeling methods show promise in reducing interference. However, they often rely on heuristics to distribute capacity …
External link:
http://arxiv.org/abs/2404.11201
Author:
Liao, Baohao, Monz, Christof
With the growing size of large language models, the role of quantization becomes increasingly significant. However, outliers present in weights or activations notably influence the performance of quantized models. Recently, \citet{qtransformer} introduced … (a hedged numerical illustration follows below).
External link:
http://arxiv.org/abs/2402.12102
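To make the outlier issue above concrete, the toy example below quantizes a weight vector with plain absmax int8 rounding: a single large entry stretches the quantization scale and inflates the error on all the ordinary weights. This is a generic illustration, not the method of \citet{qtransformer} or of this paper.

```python
# Toy illustration: one outlier stretches the absmax scale of uniform int8
# quantization, so the error on the remaining weights grows sharply.
import numpy as np

def quantize_int8(w: np.ndarray) -> np.ndarray:
    scale = np.abs(w).max() / 127.0                      # absmax scaling
    return np.round(w / scale).clip(-127, 127) * scale

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=1024)

clean_err = np.abs(quantize_int8(weights) - weights).mean()

weights_outlier = weights.copy()
weights_outlier[0] = 8.0                                 # inject a single outlier
outlier_err = np.abs(quantize_int8(weights_outlier) - weights_outlier)[1:].mean()

print(f"mean error without outlier: {clean_err:.6f}")
print(f"mean error with outlier:    {outlier_err:.6f}")  # orders of magnitude larger
```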