Showing 1 - 10 of 81 for search: '"Mikolov, Tomas"'
Who is the US President? The answer changes depending on when the question is asked. While large language models (LLMs) are evaluated on various reasoning tasks, they often miss a crucial dimension: time. In real-world scenarios, the correctness of a
External link:
http://arxiv.org/abs/2409.13338
Authors:
Herel, David, Mikolov, Tomas
How much is 56 times 37? Language models often make mistakes in these types of difficult calculations. This is usually explained by their inability to perform complex reasoning. Since language models rely on large training sets and great memorization
External link:
http://arxiv.org/abs/2405.08644
Authors:
Herel, David, Mikolov, Tomas
In various fields of knowledge creation, including science, new ideas often build on pre-existing information. In this work, we explore this concept within the context of language models. Specifically, we explore the potential of self-training models
External link:
http://arxiv.org/abs/2404.02305
Authors:
Minaee, Shervin, Mikolov, Tomas, Nikzad, Narjes, Chenaghlu, Meysam, Socher, Richard, Amatriain, Xavier, Gao, Jianfeng
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks, since the release of ChatGPT in November 2022. LLMs' ability of general-purpose language understanding and generatio
External link:
http://arxiv.org/abs/2402.06196
Authors:
Herel, David, Mikolov, Tomas
Generalization is arguably the most important goal of statistical language modeling research. Publicly available benchmarks and papers published with open-source code have been critical to advancing the field. However, it is often very difficult,
External link:
http://arxiv.org/abs/2312.03735
Published in:
ECAI 2023
The growth of hateful online content, or hate speech, has been associated with a global increase in violent crimes against minorities [23]. Harmful online content can be produced easily, automatically and anonymously. Even though some form of auto-d
External link:
http://arxiv.org/abs/2211.04205
It is common to evaluate the performance of a machine learning model by measuring its predictive power on a test dataset. This approach favors complicated models that can smoothly fit complex functions and generalize well from training data points. A
External link:
http://arxiv.org/abs/2210.02549
Published in:
Artificial Life Conference Proceedings 2022. MIT Press
One of the main problems of evolutionary algorithms is the convergence of the population to local minima. In this paper, we explore techniques that can avoid this problem by encouraging a diverse behavior of the agents through a shared reward system.
External link:
http://arxiv.org/abs/2207.04857
Authors:
Yorsh, Uladzislau, Kovalenko, Alexander, Vančura, Vojtěch, Vašata, Daniel, Kordík, Pavel, Mikolov, Tomáš
In this paper, we propose that the dot product pairwise matching attention layer, which is widely used in Transformer-based models, is redundant for the model performance. Attention, in its original formulation, has to be seen rather as a human-level
External link:
http://arxiv.org/abs/2111.15588
Authors:
Hudcová, Barbora, Mikolov, Tomáš
In order to develop systems capable of artificial evolution, we need to identify which systems can produce complex behavior. We present a novel classification method applicable to any class of deterministic discrete space and time dynamical systems.
External link:
http://arxiv.org/abs/2108.01573