Showing 1 - 10 of 986 for search: '"Yamshchikov ON"'
Author:
Vikhorev, Dmitry; Galimzianova, Daria; Gorovaia, Svetlana; Zhemchuzhina, Elizaveta; Yamshchikov, Ivan P.
Humor generation is a challenging task in natural language processing due to limited resources and the quality of existing datasets. Available humor language resources often suffer from toxicity and duplication, limiting their effectiveness for train
External link:
http://arxiv.org/abs/2412.09203
Open-source large language models are becoming increasingly available and popular among researchers and practitioners. While significant progress has been made on open-weight models, open training data is a practice yet to be adopted by the leading o
External link:
http://arxiv.org/abs/2410.22587
This paper evaluates the performance of Large Language Models (LLMs) in authorship attribution and authorship verification tasks for Latin texts of the Patristic Era. The study showcases that LLMs can be robust in zero-shot authorship verification ev
External link:
http://arxiv.org/abs/2410.09245
We show differences between a language-and-vision model CLIP and two text-only models - FastText and SBERT - when it comes to the encoding of individuation information. We study latent representations that CLIP provides for substrates, granular aggre
External link:
http://arxiv.org/abs/2409.18868
Language models can largely benefit from efficient tokenization. However, they still mostly utilize the classical BPE algorithm, a simple and reliable method. This has been shown to cause such issues as under-trained tokens and sub-optimal compressio
External link:
http://arxiv.org/abs/2409.04599
With the rise of computational social science, many scholars utilize data analysis and natural language processing tools to analyze social media, news articles, and other accessible data sources for examining political and social discourse. Particula
External link:
http://arxiv.org/abs/2404.03437
This paper investigates the communication styles and structures of Twitter (X) communities within the vaccination context. While mainstream research primarily focuses on the echo-chamber phenomenon, wherein certain ideas are reinforced and participan
External link:
http://arxiv.org/abs/2403.19423
Author:
Surkov, Maxim K.; Yamshchikov, Ivan P.
Evaluation plays a significant role in modern natural language processing. Most modern NLP benchmarks consist of arbitrary sets of tasks that neither guarantee any generalization potential for the model once applied outside the test set nor try to mi
External link:
http://arxiv.org/abs/2402.14890
An empirical investigation into the simulation of the Big Five personality traits by large language models (LLMs), namely Llama2, GPT4, and Mixtral, is presented. We analyze the personality traits simulated by these models and their stability. This c
External link:
http://arxiv.org/abs/2402.01765
This study explores four methods of generating paraphrases in Malayalam, utilizing resources available for English paraphrasing and pre-trained Neural Machine Translation (NMT) models. We evaluate the resulting paraphrases using both automated metric
External link:
http://arxiv.org/abs/2401.17827