Showing 1 - 10 of 186 for search: '"Honovich, Or"'
Scaling inference compute in large language models (LLMs) through repeated sampling consistently increases the coverage (fraction of problems solved) as the number of samples increases. We conjecture that this observed improvement is partially due to …
External link:
http://arxiv.org/abs/2410.15466
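The coverage metric this abstract describes is commonly estimated with the unbiased pass@k formula; a minimal sketch (the function name and structure are illustrative, not taken from the paper):

```python
from math import comb

def coverage_at_k(n, c, k):
    """Unbiased pass@k estimator: the probability that at least one of k
    samples, drawn without replacement from n generated samples of which
    c are correct, solves the problem."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so every size-k subset
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Averaging this quantity over a problem set gives the "fraction of problems solved" curve as k grows.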
Author:
Jacovi, Alon; Bitton, Yonatan; Bohnet, Bernd; Herzig, Jonathan; Honovich, Or; Tseng, Michael; Collins, Michael; Aharoni, Roee; Geva, Mor
Prompting language models to provide step-by-step answers (e.g., "Chain-of-Thought") is the prominent approach for complex reasoning tasks, where more accurate reasoning chains typically improve downstream task performance. Recent literature discusses …
External link:
http://arxiv.org/abs/2402.00559
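A minimal sketch of what such step-by-step prompting looks like in practice (the helper and prompt wording are illustrative assumptions, not the paper's method):

```python
def chain_of_thought_prompt(question: str, demos: list[tuple[str, str, str]]) -> str:
    """Build a few-shot prompt whose demonstrations spell out the
    intermediate reasoning, nudging the model to answer step by step.

    Each demo is a (question, reasoning_chain, answer) triple.
    """
    parts = [
        f"Q: {q}\nReasoning: {chain}\nA: {answer}"
        for q, chain, answer in demos
    ]
    # End with the target question and an open "Reasoning:" cue so the
    # model continues with a reasoning chain before its final answer.
    parts.append(f"Q: {question}\nReasoning:")
    return "\n\n".join(parts)
```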
Ensuring that large language models (LMs) are fair, robust, and useful requires an understanding of how different modifications to their inputs impact the model's behaviour. In the context of open-text generation tasks, however, such an evaluation is …
External link:
http://arxiv.org/abs/2305.07378
Instruction tuning enables pretrained language models to perform new tasks from inference-time natural language descriptions. These approaches rely on vast amounts of human supervision in the form of crowdsourced datasets or user interactions. In this …
External link:
http://arxiv.org/abs/2212.09689
Question answering models commonly have access to two sources of "knowledge" during inference time: (1) parametric knowledge - the factual knowledge encoded in the model weights, and (2) contextual knowledge - external knowledge (e.g., a Wikipedia passage) …
External link:
http://arxiv.org/abs/2211.05655
As the performance of large language models rapidly improves, benchmarks are getting larger and more complex as well. We present LMentry, a benchmark that avoids this "arms race" by focusing on a compact set of tasks that are trivial to humans, e.g. …
External link:
http://arxiv.org/abs/2211.02069
Large language models are able to perform a task by conditioning on a few input-output demonstrations - a paradigm known as in-context learning. We show that language models can explicitly infer an underlying task from a few demonstrations by prompting …
External link:
http://arxiv.org/abs/2205.10782
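The setup described in this abstract amounts to formatting the demonstrations and asking the model to state the task; a minimal sketch, where the meta-prompt wording is an illustrative assumption rather than the paper's exact template:

```python
def instruction_induction_prompt(demos: list[tuple[str, str]]) -> str:
    """Format input-output demonstrations and ask the model to verbalize
    the underlying instruction that maps inputs to outputs."""
    lines = [
        "I gave a friend an instruction. Based on the instruction they "
        "produced the following input-output pairs:",
        "",
    ]
    for x, y in demos:
        lines.append(f"Input: {x}\nOutput: {y}")
    # The open-ended completion cue elicits the inferred instruction.
    lines.append("\nThe instruction was:")
    return "\n".join(lines)
```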
Author:
Honovich, Or; Aharoni, Roee; Herzig, Jonathan; Taitelbaum, Hagai; Kukliansy, Doron; Cohen, Vered; Scialom, Thomas; Szpektor, Idan; Hassidim, Avinatan; Matias, Yossi
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability. Automatic factual consistency evaluation may help alleviate this limitation by accelerating evaluation cycles, filtering …
External link:
http://arxiv.org/abs/2204.04991
Neural knowledge-grounded generative models for dialogue often produce content that is factually inconsistent with the knowledge they rely on, making them unreliable and limiting their applicability. Inspired by recent work on evaluating factual consistency …
External link:
http://arxiv.org/abs/2104.08202
Published in:
In Midwifery September 2019 76:132-141