Showing 1 - 10 of 71 results for search: '"Ghazvininejad, Marjan"'
Recent work in image and video generation has been adopting the autoregressive LLM architecture due to its generality and potentially easy integration into multi-modal systems. The crux of applying autoregressive training in language generation to visual generation …
External link:
http://arxiv.org/abs/2408.08459
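As a minimal, illustrative sketch of what "autoregressive training" looks like when carried over from language to visual data: an image is discretized into a grid of tokens and the model is trained with the usual shifted next-token cross-entropy objective. The codebook size, sequence layout, and stand-in model below are assumptions for illustration, not the paper's setup.

```python
# Illustrative only: next-token prediction over a flattened grid of discrete
# image tokens, using the same objective as autoregressive language models.
import torch
import torch.nn as nn

vocab_size = 1024      # hypothetical codebook size of a discrete image tokenizer
seq_len = 16 * 16      # a 16x16 grid of image tokens, flattened row by row

# Stand-in "model": embedding followed by a linear head; a real system would use
# a Transformer so each position can attend to the tokens before it.
model = nn.Sequential(nn.Embedding(vocab_size, 64), nn.Linear(64, vocab_size))

tokens = torch.randint(0, vocab_size, (2, seq_len))  # a fake batch of tokenized images
logits = model(tokens[:, :-1])                       # inputs are positions 0..T-2
targets = tokens[:, 1:]                              # targets are positions 1..T-1
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                      # one LM-style training step
```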
Diffusion-based language models are emerging as a promising alternative to autoregressive LMs: they approach the competence of autoregressive LMs while offering nuanced controllability at inference time. While autoregressive LMs have benefited immensely …
External link:
http://arxiv.org/abs/2305.14771
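As rough intuition for how such non-left-to-right generators differ from autoregressive decoding, the toy loop below sketches iterative refinement: every position starts masked, a few more positions are committed at each step, and low-confidence positions are re-predicted. The "model" here is a random stand-in, and this is a generic sketch of the decoding style, not the specific method of the paper above.

```python
# Toy iterative-unmasking decoder with a random stand-in for model confidence.
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "sat", "on", "a", "mat"]

def dummy_predict(tokens):
    """Stand-in model: a (word, confidence) guess for every position."""
    return [(random.choice(VOCAB), random.random()) for _ in tokens]

def iterative_decode(length=6, steps=3):
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        guesses = dummy_predict(tokens)
        n_keep = step * length // steps               # unmask more positions each step
        keep = set(sorted(range(length), key=lambda i: -guesses[i][1])[:n_keep])
        tokens = [guesses[i][0] if i in keep else MASK for i in range(length)]
        print(f"step {step}: {tokens}")
    return tokens

iterative_decode()
```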
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting, even though they were not explicitly trained for this task. However, even given the incredible quantities of data they are trained on, LLMs can struggle …
External link:
http://arxiv.org/abs/2302.07856
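For context, prompting an LLM for MT simply means stating the task in natural language around the source sentence; no MT-specific training is involved. A minimal sketch follows, with `call_llm` as a hypothetical stand-in for whatever inference API is available.

```python
# Build a plain translation prompt; the task is described inside the prompt itself.
def build_mt_prompt(source_sentence, src_lang="French", tgt_lang="English"):
    return (
        f"Translate the following {src_lang} sentence into {tgt_lang}.\n\n"
        f"{src_lang}: {source_sentence}\n"
        f"{tgt_lang}:"
    )

prompt = build_mt_prompt("Le chat dort sur le canapé.")
print(prompt)
# translation = call_llm(prompt)   # hypothetical call to any general-purpose LLM
```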
Authors:
Meng, Yu, Krishnan, Jitin, Wang, Sinong, Wang, Qifan, Mao, Yuning, Fang, Han, Ghazvininejad, Marjan, Han, Jiawei, Zettlemoyer, Luke
Masked Language Modeling (MLM) has been one of the most prominent approaches for pretraining bidirectional text encoders due to its simplicity and effectiveness. One notable concern about MLM is that the special [MASK] symbol causes a discrepancy …
External link:
http://arxiv.org/abs/2302.02060
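The [MASK] discrepancy mentioned above comes from the corruption step of MLM pretraining: a fraction of input tokens is replaced by a symbol that never appears in downstream inputs. A minimal sketch of that corruption, using the commonly cited 15% / 80-10-10 rates (an assumption here, not necessarily this paper's configuration):

```python
import random

def mlm_corrupt(tokens, vocab, mask_prob=0.15):
    """Standard MLM corruption: select ~15% of positions; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged."""
    corrupted, targets = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            targets.append(tok)            # this position contributes to the loss
            r = random.random()
            if r < 0.8:
                corrupted.append("[MASK]")
            elif r < 0.9:
                corrupted.append(random.choice(vocab))
            else:
                corrupted.append(tok)
        else:
            targets.append(None)           # ignored by the loss
            corrupted.append(tok)
    return corrupted, targets

vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
print(mlm_corrupt("the cat sat on the mat".split(), vocab))
```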
Authors:
Liang, Davis, Gonen, Hila, Mao, Yuning, Hou, Rui, Goyal, Naman, Ghazvininejad, Marjan, Zettlemoyer, Luke, Khabsa, Madian
Large multilingual language models typically rely on a single vocabulary shared across 100+ languages. As these models have increased in parameter count and depth, vocabulary size has remained largely unchanged. This vocabulary bottleneck limits …
External link:
http://arxiv.org/abs/2301.10472
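To make the bottleneck concrete, the back-of-the-envelope calculation below shows how a fixed shared vocabulary becomes a shrinking fraction of total parameters as models grow, while still having to cover 100+ languages. The sizes are illustrative assumptions, not figures from the paper.

```python
# Rough arithmetic: share of total parameters spent on the token embedding table.
def embedding_params(vocab_size, hidden_dim):
    return vocab_size * hidden_dim

for vocab, hidden, total in [
    (250_000, 768, 270e6),   # roughly base-size multilingual encoder proportions
    (250_000, 4096, 10e9),   # a much larger model with the same shared vocabulary
]:
    emb = embedding_params(vocab, hidden)
    print(f"vocab={vocab:,} hidden={hidden}: {emb/1e6:.0f}M embedding params "
          f"({emb/total:.1%} of {total/1e9:.1f}B total)")
```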
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model. For Machine Translation (MT), these …
External link:
http://arxiv.org/abs/2212.02437
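In-context learning for MT means prepending a handful of source-target example pairs to the prompt, and which pairs are chosen matters. The sketch below uses a trivial word-overlap score as a stand-in for a real selection criterion; the example pool, language pair, and scoring are assumptions for illustration.

```python
def overlap(a, b):
    """Crude similarity: shared lowercase words between two sentences."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def build_fewshot_prompt(pool, source, k=2):
    # Pick the k pool examples whose source side overlaps most with the input.
    examples = sorted(pool, key=lambda ex: -overlap(ex[0], source))[:k]
    lines = [f"French: {src}\nEnglish: {tgt}" for src, tgt in examples]
    lines.append(f"French: {source}\nEnglish:")
    return "\n\n".join(lines)

pool = [
    ("Le chien dort.", "The dog is sleeping."),
    ("Il pleut beaucoup aujourd'hui.", "It is raining a lot today."),
    ("Le chat dort sur le canapé.", "The cat is sleeping on the sofa."),
]
print(build_fewshot_prompt(pool, "Le chat dort dans le jardin."))
```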
Generative models of code, pretrained on large corpora of programs, have shown great success in translating natural language to code (Chen et al., 2021; Austin et al., 2021; Li et al., 2022, inter alia). While these models do not explicitly incorporate …
External link:
http://arxiv.org/abs/2204.11454
Recently, there has been a surge of interest in the NLP community on the use of pretrained Language Models (LMs) as Knowledge Bases (KBs). Researchers have shown that LMs trained on a sufficiently large (web) corpus will encode a significant amount of …
External link:
http://arxiv.org/abs/2204.06031
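The usual way to treat an LM as a knowledge base is cloze-style probing: a relational fact is rewritten as a fill-in-the-blank query and the model's prediction for the blank is checked against the gold object. A minimal sketch follows; the templates and facts are illustrative, and `fill_mask` is a hypothetical stand-in for a masked-LM inference call.

```python
# Turn relational facts into cloze queries for probing a masked LM.
TEMPLATES = {
    "capital_of": "The capital of {subj} is [MASK].",
    "born_in":    "{subj} was born in [MASK].",
}

def make_query(relation, subject):
    return TEMPLATES[relation].format(subj=subject)

facts = [("capital_of", "France", "Paris"), ("born_in", "Marie Curie", "Warsaw")]
for rel, subj, gold in facts:
    query = make_query(rel, subj)
    print(query, "-> expected:", gold)
    # prediction = fill_mask(query)        # hypothetical masked-LM call
    # correct = (prediction == gold)
```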
Current efficient fine-tuning methods (e.g., adapters, prefix-tuning, etc.) have optimized conditional text generation via training a small set of extra parameters of the neural language model, while freezing the rest for efficiency. While showing strong …
External link:
http://arxiv.org/abs/2112.05717
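The recipe described above (freeze the pretrained weights, train only a small number of new parameters) can be sketched in a few lines. The toy "base model" and prepended soft prefix below are illustrative placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

base_model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
for p in base_model.parameters():
    p.requires_grad = False                            # freeze everything pretrained

prefix = nn.Parameter(torch.randn(10, 64) * 0.02)      # the only trainable parameters
optimizer = torch.optim.Adam([prefix], lr=1e-3)        # optimizer sees just the prefix

inputs = torch.randn(16, 64)                           # dummy "token embeddings"
out = base_model(torch.cat([prefix, inputs], dim=0))   # prepend the soft prefix
loss = out.pow(2).mean()                               # placeholder objective
loss.backward()
optimizer.step()

trainable = prefix.numel()
frozen = sum(p.numel() for p in base_model.parameters())
print(f"trainable params: {trainable} vs frozen: {frozen}")
```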
Mined bitexts can contain imperfect translations that yield unreliable training signals for Neural Machine Translation (NMT). While filtering such pairs out is known to improve final model quality, we argue that it is suboptimal in low-resource conditions …
External link:
http://arxiv.org/abs/2111.06787
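For context, the standard baseline the abstract argues against looks like the sketch below: each mined pair carries a quality/similarity score, and pairs below a threshold are discarded before NMT training. The scores, threshold, and the keep-and-weight alternative shown are illustrative assumptions, not the paper's method.

```python
# Mined sentence pairs with made-up similarity scores from a bitext miner.
mined_pairs = [
    ("Le chat dort.", "The cat is sleeping.", 0.92),
    ("Bonjour tout le monde.", "The weather is nice.", 0.31),   # misaligned pair
    ("Il pleut.", "It is raining.", 0.88),
]

THRESHOLD = 0.5   # made-up cutoff on the mining score

# Hard filtering: drop low-scoring pairs entirely (the common baseline).
filtered = [(src, tgt) for src, tgt, score in mined_pairs if score >= THRESHOLD]

# One softer alternative: keep every pair but carry its score as a training weight.
weighted = [(src, tgt, score) for src, tgt, score in mined_pairs]

print(f"kept {len(filtered)} / {len(mined_pairs)} pairs after hard filtering")
```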