Showing 1 - 5 of 5 for search: '"Panwar, Madhur"'
The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations. In this paper, we argue that this information is embodied by the subset of inputs that give rise to similar activations …
External link:
http://arxiv.org/abs/2405.17653
Author:
Ahuja, Kabir, Balachandran, Vidhisha, Panwar, Madhur, He, Tianxing, Smith, Noah A., Goyal, Navin, Tsvetkov, Yulia
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures without explicitly encoding any structural bias. In this work, we investigate sources of in…
External link:
http://arxiv.org/abs/2404.16367
In-context learning (ICL) is one of the surprising and useful features of large language models and a subject of intense research. Recently, stylized meta-learning-like ICL setups have been devised that train transformers on sequences of input-output pairs …
External link:
http://arxiv.org/abs/2306.04891
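To make the "stylized meta-learning-like ICL setup" this abstract mentions concrete, here is a minimal sketch, assuming a toy protocol (not necessarily the paper's exact one): a small transformer is trained on prompts of (x, f(x)) pairs, with a fresh linear function f sampled for each prompt. All names and hyperparameters below are illustrative assumptions.

```python
import torch
import torch.nn as nn

def sample_prompt(batch, n_points, dim):
    # Each prompt gets its own linear function f(x) = w . x; the model
    # must infer w from the in-context (x, y) pairs alone.
    w = torch.randn(batch, dim, 1)
    x = torch.randn(batch, n_points, dim)
    y = x @ w                                    # (batch, n_points, 1)
    # Interleave x_i and y_i tokens; pad labels to the input width.
    y_tok = torch.cat([y, torch.zeros(batch, n_points, dim - 1)], dim=-1)
    seq = torch.stack([x, y_tok], dim=2).reshape(batch, 2 * n_points, dim)
    return seq, y

class ICLTransformer(nn.Module):
    def __init__(self, dim, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.readout = nn.Linear(d_model, 1)

    def forward(self, seq):
        # Causal mask so the prediction for y_i sees only earlier pairs
        # and the current x_i, never y_i itself.
        mask = nn.Transformer.generate_square_subsequent_mask(seq.size(1))
        h = self.encoder(self.embed(seq), mask=mask)
        return self.readout(h[:, 0::2, :])       # read out at x positions

dim, n_points = 8, 16
model = ICLTransformer(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):                          # toy budget for the sketch
    seq, y = sample_prompt(64, n_points, dim)
    loss = ((model(seq) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

The point of the setup is that, after training across many sampled functions, a single forward pass over a new prompt yields predictions for an unseen f, with no weight updates at test time.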
Topic models have been widely used to learn text representations and gain insight into document corpora. To perform topic discovery, most existing neural models either take document bag-of-words (BoW) or sequences of tokens as input, followed by variational …
External link:
http://arxiv.org/abs/2012.01524
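As a reference point for the BoW-plus-variational-inference recipe this snippet alludes to, here is a minimal sketch of a generic VAE-style neural topic model; the architecture and names are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModel(nn.Module):
    # Encode a document's bag-of-words into topic proportions, then
    # decode them back to a distribution over the vocabulary.
    def __init__(self, vocab_size, n_topics=20, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, n_topics)
        self.logvar = nn.Linear(hidden, n_topics)
        self.decode = nn.Linear(n_topics, vocab_size)   # topic-word weights

    def forward(self, bow):
        h = self.enc(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam trick
        theta = F.softmax(z, dim=-1)          # document-topic proportions
        log_probs = F.log_softmax(self.decode(theta), dim=-1)
        recon = -(bow * log_probs).sum(-1).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon + kl                     # negative ELBO to minimize

# Toy usage: 100 documents over a 500-word vocabulary.
bow = torch.randint(0, 3, (100, 500)).float()
model = NeuralTopicModel(vocab_size=500)
loss = model(bow)
loss.backward()
```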
In-context learning is one of the surprising and useful features of large language models. How it works is an active area of research. Recently, stylized meta-learning-like setups have been devised that train these models on a sequence of input-output pairs …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9ff37bc48a05ec0fd1293b24de37d949