Search results - "Matej Ulčar"

Academic article

Sequence-to-sequence pretraining for a less-resourced Slovenian language

Authors: Matej Ulčar, Marko Robnik-Šikonja

Published in: Frontiers in Artificial Intelligence, Vol 6 (2023)

Introduction: Large pretrained language models have recently conquered the area of natural language processing. As an alternative to predominant masked language modeling introduced in BERT, the T5 model has introduced a more general training objective,

External link: https://doaj.org/article/52757f1102d44804861f566763d0faf5

Academic article

Slovene and Croatian word embeddings in terms of gender occupational analogies

Authors: Matej Ulčar, Anka Supej, Marko Robnik-Šikonja, Senja Pollak

Published in: Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave, Vol 9, Iss 1 (2021)

In recent years, the use of deep neural networks and dense vector embeddings for text representation has led to excellent results in the field of computational understanding of natural language. It has also been shown that word embeddings often capt

External link: https://doaj.org/article/9d4e22fbe85e43159fb32dc66266b2c4

Training Dataset and Dictionary Sizes Matter in BERT Models: The Case of Baltic Languages

Authors: Matej Ulčar, Marko Robnik-Šikonja

Published in: Lecture Notes in Computer Science, ISBN: 9783031164996

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::641abc5c15e10515e661c1e46d93b440
https://doi.org/10.1007/978-3-031-16500-9_14

Slovene and Croatian word embeddings in terms of gender occupational analogies

Authors: Marko Robnik-Šikonja, Matej Ulčar, Anka Supej, Senja Pollak

Published in: Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave, Vol 9, Iss 1 (2021)
Slovenščina 2.0: empirical, applied and interdisciplinary research

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::92b75fae291273457768e9d917bb5a71
https://revije.ff.uni-lj.si/slovenscina2/article/view/9883

SemEval-2020 Task 3: Graded Word Similarity in Context

Authors: Senja Pollak, Matthew Purver, Nikola Ljubešić, Mohammad Taher Pilehvar, Ivan Vulić, Matej Ulčar, Carlos Santos Armendariz

Published in: SemEval@COLING

This paper presents the Graded Word Similarity in Context (GWSC) task which asked participants to predict the effects of context on human perception of similarity in English, Croatian, Slovene and Finnish. We received 15 submissions and 11 system des

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::cbc11035b6b4cfeddde7dfa9ae10c019

FinEst BERT and CroSloEngual BERT: less is more in multilingual models

Authors: Marko Robnik-Šikonja, Matej Ulčar

Published in: Text, Speech, and Dialogue: 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8–11, 2020, Proceedings
Text, Speech, and Dialogue, ISBN: 9783030583224

Large pretrained masked language models have become state-of-the-art solutions for many NLP problems. The research has been mostly focused on the English language, though. While massively multilingual models exist, studies have shown that monolingual mod

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2a308fec22ba03acf95560ee91643aed
http://arxiv.org/abs/2006.07890

CoSimLex: A Resource for Evaluating Graded Word Similarity in Context

Authors: Carlos Santos Armendariz, Matthew Purver, Matej Ulčar, Senja Pollak, Nikola Ljubešić, Marko Robnik-Šikonja, Mark Granroth-Wilding, Kristiina Vaik

Published in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)

State of the art natural language processing tools are built on context-dependent word embeddings, but no direct method for evaluating these representations currently exists. Standard tasks and datasets for intrinsic evaluation of embeddings are base

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2cf2aa091b6f4d44c591dbded6c64133
https://doi.org/10.5281/zenodo.3894565

LitLat BERT

Authors: Matej Ulčar, Marko Robnik-Šikonja

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::2b1d7e103dfe2f4bb39fc23f2c11a9f1
https://doi.org/10.7220/20.500.12259/240312

Razpoznavanje slovenskega govora z metodami globokih nevronskih mrež [Slovenian speech recognition using deep neural network methods]

Authors: Matej Ulčar, Simon Dobrišek, Marko Robnik-Šikonja

Published in: Uporabna informatika (Ljubljana)

Recently, deep neural networks have been gaining ground in the field of automatic speech recognition, replacing acoustic modeling based on HMM and GMM models and n-grams for the language model. For the recognition of spoken Slovenian, we

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::559c0811fd8c9fbddd22ccfa67158817
http://www.dlib.si/details/URN:NBN:SI:doc-WV49C0GL

Cross-lingual alignments of ELMo contextual embeddings

Authors: Matej Ulčar, Marko Robnik-Šikonja

Published in: Neural Computing and Applications

Building machine learning prediction models for a specific NLP task requires sufficient training data, which can be difficult to obtain for less-resourced languages. Cross-lingual embeddings map word embeddings from a less-resourced language to a res

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0c2deef1f71fa990efcebd41cee3fc32
