Showing 1 - 10 of 47
for search: '"Yangarber, Roman"'
Assessment of proficiency of the learner is an essential part of Intelligent Tutoring Systems (ITS). We use Item Response Theory (IRT) in computer-aided language learning for assessment of student ability in two contexts: in test sessions, and in exe…
External link:
http://arxiv.org/abs/2409.16133
The remarkable capabilities of Large Language Models (LLMs) in text comprehension and generation have revolutionized Information Extraction (IE). One such advancement is in Document-level Relation Triplet Extraction (DocRTE), a critical task in infor…
External link:
http://arxiv.org/abs/2409.13717
Author:
Katinskaia, Anisia, Yangarber, Roman
We investigate how pretrained language models (PLMs) encode the grammatical category of verbal aspect in Russian. Encoding of aspect in transformer LMs has not been studied previously in any language. A particular challenge is posed by "alternative co…
External link:
http://arxiv.org/abs/2406.02335
Author:
Katinskaia, Anisia, Yangarber, Roman
This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models.
External link:
http://arxiv.org/abs/2405.08469
Author:
Hou, Jue, Katinskaia, Anisia, Kotilainen, Lari, Trangcasanchai, Sathianpong, Vu, Anh-Duc, Yangarber, Roman
This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government rel…
External link:
http://arxiv.org/abs/2404.14270
This paper presents a corpus manually annotated with named entities for six Slavic languages - Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. This work is the result of a series of shared tasks, conducted in 2017-2023 as a part of the W…
External link:
http://arxiv.org/abs/2404.00482
Published in:
EMNLP 2023
Language modeling is a fundamental task in natural language processing, which has been thoroughly explored with various architectures and hyperparameters. However, few studies focus on the effect of sub-word segmentation on the performance of languag…
External link:
http://arxiv.org/abs/2305.05480
This paper presents the development of Revita, an AI-based language learning platform. It is a freely available intelligent online tutor, developed to support learners of multiple languages, from low-intermediate to advanced levels. It has been in pil…
External link:
http://arxiv.org/abs/2212.01711
Author:
Kylliäinen, Ilmari, Yangarber, Roman
Recent advances in the field of language modeling have improved the state-of-the-art in question answering (QA) and question generation (QG). However, the development of modern neural models, their benchmarks, and datasets for training them has mainl…
External link:
http://arxiv.org/abs/2211.13794
Published in:
Proceedings of LREC-2020: the 12th Conference on Language Resources and Evaluation
We consider the problem of disambiguating the lemma and part of speech of ambiguous words in morphologically rich languages. We propose a method for disambiguating ambiguous words in context, using a large un-annotated corpus of text, and a morpholog…
External link:
http://arxiv.org/abs/2007.06104