Showing 1 - 6 of 6 results for the search: '"Li, Jiaoda"'
Large language models (LLMs) exhibit an intriguing ability to learn a novel task from in-context examples presented in a demonstration, termed in-context learning (ICL). Understandably, a swath of research has been dedicated to uncovering the theories… (A minimal illustrative sketch of an ICL-style prompt follows the link below.)
External link:
http://arxiv.org/abs/2406.04216
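The abstract describes in-context learning only in general terms, so here is a minimal, hypothetical sketch of what an ICL demonstration looks like as a prompt. It is not taken from the linked paper; the word-reversal task, the example pairs, and the expected completion are all illustrative assumptions.

# Hypothetical illustration of in-context learning (ICL), not from the linked paper:
# a novel "task" (here, reversing a word) is specified purely through in-context
# examples in the demonstration, and the model is expected to infer the mapping
# from those examples alone, without any parameter updates.

demonstration = [
    ("cat", "tac"),      # in-context example 1
    ("house", "esuoh"),  # in-context example 2
    ("robot", "tobor"),  # in-context example 3
]
query = "planet"

# Build a few-shot prompt: each block pairs an input with its label,
# and the final line leaves the label for the model to complete.
prompt = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in demonstration)
prompt += f"\nInput: {query}\nOutput:"

print(prompt)
# A sufficiently capable LLM is expected to complete this with "tenalp",
# i.e. the task is learned from the demonstration alone.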
Natural languages are believed to be (mildly) context-sensitive. Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling…
External link:
http://arxiv.org/abs/2405.04515
Author:
Hou, Yifan, Li, Jiaoda, Fei, Yu, Stolfo, Alessandro, Zhou, Wangchunshu, Zeng, Guangtao, Bosselut, Antoine, Sachan, Mrinmaya
Recent work has shown that language models (LMs) have strong multi-step (i.e., procedural) reasoning capabilities. However, it is unclear whether LMs perform these tasks by cheating with answers memorized from the pretraining corpus or via a multi-step…
External link:
http://arxiv.org/abs/2310.14491
Probing is a popular method to discern what linguistic information is contained in the representations of pre-trained language models. However, the mechanism of selecting the probe model has recently been subject to intense debate, as it is not clear… (A minimal probing sketch follows the link below.)
External link:
http://arxiv.org/abs/2207.01736
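Since the abstract refers to probing and to the debate over probe selection only in general terms, the following is a sketch of a standard linear probe, not the method proposed in the linked paper. The random "hidden states", the label set, and all dimensions are placeholder assumptions; a real probe would be trained on frozen representations from an actual pre-trained LM.

# A minimal probing sketch (not the linked paper's method): a simple classifier is
# trained on frozen pre-trained representations to predict a linguistic property,
# and its accuracy is read as evidence that the property is encoded there.
# The representations below are random stand-ins for real LM hidden states.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_tokens, hidden_dim, n_tags = 2000, 64, 5
hidden_states = rng.normal(size=(n_tokens, hidden_dim))  # stand-in for frozen LM representations
pos_tags = rng.integers(0, n_tags, size=n_tokens)        # stand-in for part-of-speech labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, pos_tags, test_size=0.2, random_state=0
)

# The probe itself: a linear classifier kept deliberately simple, so that high
# accuracy reflects information in the representations rather than probe capacity.
# The debate mentioned in the abstract is precisely about how to make this choice.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")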
Multimodal machine translation (MMT) systems have been shown to outperform their text-only neural machine translation (NMT) counterparts when visual context is available. However, recent studies have also shown that the performance of MMT models is only…
External link:
http://arxiv.org/abs/2109.03415
Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer. Recent work has shown, however, that a large proportion of the heads in a Transformer… (A minimal multi-head attention sketch follows the link below.)
External link:
http://arxiv.org/abs/2108.04657
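To make the first sentence of the abstract concrete, here is a NumPy sketch of multi-head attention in its textbook form, with a head_mask argument illustrating the idea that individual heads can be removed. This is an assumption-laden illustration, not the differentiable subset pruning method of the linked paper; all weights and shapes are arbitrary.

# Minimal multi-head attention in NumPy: several heads attend to the input
# independently, and their outputs are concatenated. head_mask shows how pruning
# a head amounts to zeroing its contribution before the heads are merged.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads, head_mask=None):
    """x: (seq_len, d_model); Wq/Wk/Wv/Wo: (d_model, d_model); head_mask: (n_heads,) of 0/1."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    if head_mask is None:
        head_mask = np.ones(n_heads)

    # Project and split into heads: (n_heads, seq_len, d_head).
    def split(W):
        return (x @ W).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(Wq), split(Wk), split(Wv)

    # Each head attends independently over the sequence.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, seq, seq)
    heads = softmax(scores) @ v                           # (n_heads, seq, d_head)

    # Pruned heads are simply masked out before concatenation.
    heads = heads * head_mask[:, None, None]
    merged = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ Wo

rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 16, 4, 5
Wq, Wk, Wv, Wo = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
x = rng.normal(size=(seq_len, d_model))

full = multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads)
pruned = multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads, head_mask=np.array([1, 0, 1, 0]))
print(full.shape, pruned.shape)  # (5, 16) (5, 16)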