Showing 1 - 10 of 51 for search: '"Sirts, Kairit"'
Author:
Dorkin, Aleksei, Sirts, Kairit
We present our submission to the AXOLOTL-24 shared task. The shared task comprises two subtasks: identifying new senses that words gain with time (when comparing newer and older time periods) and producing the definitions for the identified new sense
External link:
http://arxiv.org/abs/2407.03861
Author:
Sharma, Neha, Sirts, Kairit
Research exploring linguistic markers in individuals with depression has demonstrated that language usage can serve as an indicator of mental health. This study investigates the impact of discussion topic as context on linguistic markers and emotiona
External link:
http://arxiv.org/abs/2405.18061
Author:
Dorkin, Aleksei, Sirts, Kairit
This paper presents the TartuNLP team submission to EvaLatin 2024 shared task of the emotion polarity detection for historical Latin texts. Our system relies on two distinct approaches to annotating training data for supervised learning: 1) creating
External link:
http://arxiv.org/abs/2405.01159
Author:
Dorkin, Aleksei, Sirts, Kairit
We present an information retrieval based reverse dictionary system using modern pre-trained language models and approximate nearest neighbors search algorithms. The proposed approach is applied to an existing Estonian language lexicon resource, Sõ
External link:
http://arxiv.org/abs/2404.19430
This paper explores the impact of incorporating sentiment, emotion, and domain-specific lexicons into a transformer-based model for depression symptom estimation. Lexicon information is added by marking the words in the input transcripts of patient-t
External link:
http://arxiv.org/abs/2404.19359
Author:
Dorkin, Aleksei, Sirts, Kairit
Published in:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pp. 280-285, May 2023
This study evaluates three different lemmatization approaches to Estonian: generative character-level models, pattern-based word-level classification models, and rule-based morphological analysis. According to our experiments, a significantly small
External link:
http://arxiv.org/abs/2404.15003
Author:
Dorkin, Aleksei, Sirts, Kairit
Published in:
Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pp. 120-130, March 2024
We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. W
External link:
http://arxiv.org/abs/2404.12845
This paper addresses the quality of annotations in mental health datasets used for NLP-based depression level estimation from social media texts. While previous research relies on social media-based datasets annotated with binary categories, i.e. dep
External link:
http://arxiv.org/abs/2403.00438
Author:
Milintsevich, Kirill, Sirts, Kairit
We propose a novel hybrid approach to lemmatization that enhances the seq2seq neural model with additional lemmas extracted from an external lexicon or a rule-based system. During training, the enhanced lemmatizer learns both to generate lemmas via a
External link:
http://arxiv.org/abs/2101.12056
Author:
Sirts, Kairit, Peekman, Kairit
Texts obtained from the web are noisy and do not necessarily follow the orthographic sentence and word boundary rules. Thus, sentence segmentation and word tokenization systems that have been developed on well-formed texts might not perform so well on un
External link:
http://arxiv.org/abs/2011.07868