Showing 1 - 10 of 150 for search: '"Cohen, Shay B."'
Code-LLMs, LLMs pre-trained on large code corpora, have shown great progress in learning rich representations of the structure and syntax of code, successfully using it to generate or classify code fragments. At the same time, understanding if they a…
External link:
http://arxiv.org/abs/2408.11081
We introduce a dataset comprising commercial machine translations, gathered weekly over six years across 12 translation directions. Since human A/B testing is commonly used, we assume commercial systems improve over time, which enables us to evaluate…
External link:
http://arxiv.org/abs/2407.03277
Author:
Ericsson, Linus, Espinosa, Miguel, Yang, Chenhongyi, Antoniou, Antreas, Storkey, Amos, Cohen, Shay B., McDonagh, Steven, Crowley, Elliot J.
Neural architecture search (NAS) finds high performing networks for a given task. Yet the results of NAS are fairly prosaic; they did not e.g. create a shift from convolutional structures to transformers. This is not least because the search spaces i…
External link:
http://arxiv.org/abs/2405.20838
Large language models (LLMs) often exhibit undesirable behaviours, such as generating untruthful or biased content. Editing their internal representations has been shown to be effective in mitigating such behaviours on top of the existing alignment m…
External link:
http://arxiv.org/abs/2405.09719
Published in:
Transactions of the Association for Computational Linguistics, Vol 7, Pp 73-89 (2019)
Lexicalized parsing models are based on the assumptions that (i) constituents are organized around a lexical head and (ii) bilexical statistics are crucial to solve ambiguities. In this paper, we introduce an unlexicalized transition-based parser for…
External link:
https://doaj.org/article/1af7e7274c5d4be2a51fc1ca004a0d72
Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning. We use Lean, a theorem proving framework, to address these challenges. By formalizing logical rea…
External link:
http://arxiv.org/abs/2403.13312
Author:
Gyevnar, Balint, Droop, Stephanie, Quillien, Tadeg, Cohen, Shay B., Bramley, Neil R., Lucas, Christopher G., Albrecht, Stefano V.
Cognitive science can help us understand which explanations people might expect, and in which format they frame these explanations, whether causal, counterfactual, or teleological (i.e., purpose-oriented). Understanding the relevance of these concept…
External link:
http://arxiv.org/abs/2403.08828
In this paper, we investigate the interplay between attention heads and specialized "next-token" neurons in the Multilayer Perceptron that predict specific tokens. By prompting an LLM like GPT-4 to explain these model internals, we can elucidate atte…
External link:
http://arxiv.org/abs/2402.15055
Extractive summaries are usually presented as lists of sentences with no expected cohesion between them. In this paper, we aim to enforce cohesion whilst controlling for informativeness and redundancy in summaries, in cases where the input exhibits h…
External link:
http://arxiv.org/abs/2402.10643
Author:
Fonseca, Marcio, Cohen, Shay B.
In this work, we investigate the controllability of large language models (LLMs) on scientific summarization tasks. We identify key stylistic and content coverage factors that characterize different types of summaries such as paper reviews, abstracts…
External link:
http://arxiv.org/abs/2401.10415