Showing 1 - 10 of 70 for search: '"Štefánik, Michal"'
Author:
Kadlčík, Marek, Štefánik, Michal
Recent language models achieve impressive results in tasks involving complex multistep reasoning, but scaling these capabilities further traditionally requires expensive collection of more annotated data. In this work, we explore the potential of imp…
External link:
http://arxiv.org/abs/2407.08400
Many recent language models (LMs) are capable of in-context learning (ICL), manifested in the LMs' ability to perform a new task solely from natural-language instruction. Previous work curating in-context learners assumes that ICL emerges from a vast…
External link:
http://arxiv.org/abs/2403.09703
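To make the ICL setting concrete, here is a minimal sketch of performing a task solely from a natural-language instruction; the transformers library and the google/flan-t5-base checkpoint are illustrative assumptions, not choices made in the paper:

# Minimal sketch: in-context learning from a natural-language instruction alone.
# The `transformers` pipeline and the flan-t5-base checkpoint are assumptions
# for illustration; the paper does not prescribe a specific model.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

# The task is described purely in natural language: no gradient updates and
# no task-specific training, the model must infer the task from the prompt.
prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: The plot was predictable, but the acting saved the film.\n"
    "Sentiment:"
)
print(generator(prompt, max_new_tokens=10)[0]["generated_text"])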
Although pre-trained named entity recognition (NER) models are highly accurate on modern corpora, they underperform on historical texts due to differences in language and OCR errors. In this work, we develop a new NER corpus of 3.6M sentences from late m…
External link:
http://arxiv.org/abs/2305.16718
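For context, a sketch of the kind of modern pre-trained NER pipeline the abstract contrasts with historical text; the dslim/bert-base-NER checkpoint is an assumption for illustration, not the model evaluated in the paper:

# Sketch: applying a pre-trained NER model of the kind discussed above.
# The checkpoint is a common public one and an assumption here; the paper
# builds its own historical corpus precisely because such models degrade
# on older language and OCR noise.
from transformers import pipeline

ner = pipeline(
    "ner",
    model="dslim/bert-base-NER",
    aggregation_strategy="simple",  # merge word pieces into whole entities
)

for entity in ner("Michal Štefánik works at Masaryk University in Brno."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))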
Despite outstanding performance in many tasks, language models are notoriously inclined to make factual errors in tasks requiring arithmetic computation. We address this deficiency by creating Calc-X, a collection of datasets that demonstrates the ap…
External link:
http://arxiv.org/abs/2305.15017
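The underlying idea, having the LM delegate arithmetic to an external calculator, can be sketched as follows; the <gadget> tag format and the eval-based calculator are simplifying assumptions, not the exact Calc-X interface:

# Illustrative sketch of calculator-augmented generation: the model emits
# arithmetic inside explicit tags and a trusted tool computes the result.
# The <gadget>...</gadget> format and the evaluator below are assumptions
# for illustration only.
import re

def solve_with_calculator(model_output: str) -> str:
    """Replace every tagged expression with the value a calculator returns."""
    def compute(match: re.Match) -> str:
        expression = match.group(1)
        # A real system would use a safe arithmetic parser, not eval().
        return str(eval(expression, {"__builtins__": {}}))
    return re.sub(r"<gadget>(.*?)</gadget>", compute, model_output)

# A hypothetical chain-of-thought step produced by the LM:
step = "The total cost is <gadget>12 * 7 + 5</gadget> dollars."
print(solve_with_calculator(step))
# -> The total cost is 89 dollars.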
Author:
Štefánik, Michal, Kadlčík, Marek
Many recent language models (LMs) of the Transformer family exhibit a so-called in-context learning (ICL) ability, manifested in the LMs' ability to modulate their function by a task described in a natural-language input. Previous work curating these mode…
External link:
http://arxiv.org/abs/2305.13775
While Large Language Models (LLMs) dominate a majority of language understanding tasks, previous work shows that some of these results are supported by modelling spurious correlations of training datasets. Authors commonly assess model robustness…
External link:
http://arxiv.org/abs/2305.06841
Despite the rapid recent progress in creating accurate and compact in-context learners, most recent work focuses on in-context learning (ICL) for tasks in English. However, the ability to interact with users of languages outside English presents a gr…
External link:
http://arxiv.org/abs/2304.01922
Author:
Štefánik, Michal, Kadlčík, Marek
Language models exhibit an emergent ability to learn a new task from a small number of input-output demonstrations. However, recent work shows that in-context learners largely rely on their pre-trained knowledge, such as the sentiment of the labels, …
External link:
http://arxiv.org/abs/2212.01692
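A minimal sketch of the demonstration-based ICL setting this abstract studies; the prompt layout and the google/flan-t5-base checkpoint are illustrative assumptions:

# Sketch: few-shot in-context learning from input-output demonstrations.
# The prompt format and checkpoint are assumptions for illustration only.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

# A handful of demonstrations defines the task. Note that with sentiment-like
# labels, the model may lean on its pre-trained notion of the labels rather
# than the demonstrated input-output mapping, the concern the abstract raises.
demonstrations = [
    ("I loved every minute of it.", "positive"),
    ("A complete waste of time.", "negative"),
    ("The soundtrack alone is worth the ticket.", "positive"),
]
query = "The dialogue felt wooden and forced."

prompt = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in demonstrations)
prompt += f"\nInput: {query}\nLabel:"
print(generator(prompt, max_new_tokens=5)[0]["generated_text"])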
Domain adaptation allows generative language models to address specific flaws caused by the domain shift of their application. However, the traditional adaptation by further training on in-domain data rapidly weakens the model's ability to generalize…
External link:
http://arxiv.org/abs/2211.16550
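For reference, a sketch of the "traditional adaptation" the abstract critiques, i.e. further causal-LM training on raw in-domain text; the checkpoint, data, and hyperparameters are illustrative assumptions:

# Sketch of traditional domain-adaptive training: continue language-model
# training on in-domain text. Checkpoint, data and hyperparameters are
# illustrative assumptions, not the paper's setup.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Tiny stand-in for an in-domain corpus (e.g. legal or medical text).
texts = ["The lessee shall indemnify the lessor against all claims."] * 32
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=64),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="adapted-lm", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # in-domain loss drops, but generalization may erode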
Author:
Štefánik, Michal
Despite their outstanding performance, large language models (LLMs) suffer notorious flaws related to their preference for simple, surface-level textual relations over the full semantic complexity of the problem. This proposal investigates a common denom…
External link:
http://arxiv.org/abs/2206.08446