Showing 1 - 8 of 8 for search: '"Tworkowski, Szymon"'
Author:
Zhao, Yu, Qu, Yuanbin, Staniszewski, Konrad, Tworkowski, Szymon, Liu, Wei, Miłoś, Piotr, Wu, Yuxiang, Minervini, Pasquale
Most language model pre-training frameworks concatenate multiple documents into fixed-length sequences and use causal masking to compute the likelihood of each token given its context; this strategy is widely adopted due to its simplicity and efficiency. …
External link:
http://arxiv.org/abs/2402.13991
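The packing strategy this abstract describes (concatenating tokenized documents into fixed-length training sequences) can be sketched as follows; the separator token id and sequence length are illustrative assumptions, not values from the paper:

```python
# Sketch of the common pre-training packing strategy: concatenate
# tokenized documents (with a separator token between them) into
# equal-length sequences for causal language modeling.
# SEP_ID and SEQ_LEN are illustrative assumptions, not the paper's values.
SEP_ID = 0
SEQ_LEN = 8

def pack_documents(docs, seq_len=SEQ_LEN, sep_id=SEP_ID):
    """Concatenate token-id lists into fixed-length chunks."""
    stream = []
    for doc in docs:
        stream.extend(doc)
        stream.append(sep_id)  # mark the document boundary
    # drop the trailing remainder that does not fill a full sequence
    n_full = len(stream) // seq_len
    return [stream[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]

docs = [[5, 6, 7], [8, 9], [10, 11, 12, 13, 14]]
print(pack_documents(docs))
```

A causal mask over each packed sequence then lets the model score every token against everything to its left, including tokens from an unrelated preceding document; that cross-document leakage is the behavior this line of work examines.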
Author:
Staniszewski, Konrad, Tworkowski, Szymon, Jaszczur, Sebastian, Zhao, Yu, Michalewski, Henryk, Kuciński, Łukasz, Miłoś, Piotr
Recent advancements in long-context large language models have attracted significant attention, yet their practical applications often suffer from suboptimal context utilization. This study investigates structuring training data to enhance semantic …
External link:
http://arxiv.org/abs/2312.17296
In this paper, we approach competitive-level programming problem-solving as a composite task of reasoning and code generation. We propose a novel method to automatically annotate natural language explanations to \textit{…} pairs. We …
External link:
http://arxiv.org/abs/2307.05337
Author:
Tworkowski, Szymon, Staniszewski, Konrad, Pacek, Mikołaj, Wu, Yuhuai, Michalewski, Henryk, Miłoś, Piotr
Large language models have an exceptional capability to incorporate new information in a contextual manner. However, the full potential of such an approach is often restrained due to a limitation in the effective context length. One solution to this …
External link:
http://arxiv.org/abs/2307.03170
Author:
Mikuła, Maciej, Tworkowski, Szymon, Antoniak, Szymon, Piotrowski, Bartosz, Jiang, Albert Qiaochu, Zhou, Jin Peng, Szegedy, Christian, Kuciński, Łukasz, Miłoś, Piotr, Wu, Yuhuai
This paper presents a novel approach to premise selection, a crucial reasoning task in automated theorem proving. Traditionally, symbolic methods that rely on extensive domain knowledge and engineering effort are applied to this task. In contrast, …
External link:
http://arxiv.org/abs/2303.04488
Author:
Jiang, Albert Q., Li, Wenda, Tworkowski, Szymon, Czechowski, Konrad, Odrzygóźdź, Tomasz, Miłoś, Piotr, Wu, Yuhuai, Jamnik, Mateja
In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to …
External link:
http://arxiv.org/abs/2205.10893
Author:
Nawrot, Piotr, Tworkowski, Szymon, Tyrolski, Michał, Kaiser, Łukasz, Wu, Yuhuai, Szegedy, Christian, Michalewski, Henryk
Published in:
Findings of the Association for Computational Linguistics: NAACL 2022.
Transformer models yield impressive results on many NLP and sequence modeling tasks. Remarkably, Transformers can handle long sequences which allows them to produce long coherent outputs: full paragraphs produced by GPT-3 or well-structured images produced by …
External link:
http://arxiv.org/abs/2110.13711