Showing 1 - 10 of 38 results for search: '"van der Plas, Lonneke"'
Author:
Jung, Vincent, van der Plas, Lonneke
We study the effect of one type of imbalance often present in real-life multilingual classification datasets: an uneven distribution of labels across languages. We show evidence that fine-tuning a transformer-based Large Language Model (LLM) on a dat
External link:
http://arxiv.org/abs/2402.13016
While analogies are a common way to evaluate word embeddings in NLP, it is also of interest to investigate whether or not analogical reasoning is a task in itself that can be learned. In this paper, we test several ways to learn basic analogical reas
External link:
http://arxiv.org/abs/2310.05597
Multilingual language models such as mBERT have seen impressive cross-lingual transfer to a variety of languages, but many languages remain excluded from these models. In this paper, we analyse the effect of pre-training with monolingual data for a l
External link:
http://arxiv.org/abs/2205.10517
Author:
DeMarco, Andrea, Mena, Carlos, Gatt, Albert, Borg, Claudia, Williams, Aiden, van der Plas, Lonneke
Recent years have seen an increased interest in the computational speech processing of Maltese, but resources remain sparse. In this paper, we consider data augmentation techniques for improving speech recognition for low-resource languages, focusing
External link:
http://arxiv.org/abs/2111.07793
Recent work has shown evidence that the knowledge acquired by multilingual BERT (mBERT) has two components: a language-specific and a language-neutral one. This paper analyses the relationship between them, in the context of fine-tuning on two tasks
External link:
http://arxiv.org/abs/2109.06935
This paper presents a novel scheme for the annotation of hate speech in corpora of Web 2.0 commentary. The proposed scheme is motivated by the critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTI
External link:
http://arxiv.org/abs/2008.06222
Author:
Mena, Carlos, Gatt, Albert, DeMarco, Andrea, Borg, Claudia, van der Plas, Lonneke, Muscat, Amanda, Padovani, Ian
Maltese, the national language of Malta, is spoken by approximately 500,000 people. Speech processing for Maltese is still in its early stages of development. In this paper, we present the first spoken Maltese corpus designed purposely for Automatic
External link:
http://arxiv.org/abs/2008.05760
In this paper, we provide a philosophical account of the value of creative systems for individuals and society. We characterize creativity in very broad philosophical terms, encompassing natural, existential, and social creative processes, such as na
External link:
http://arxiv.org/abs/2007.11973
Author:
Loi, Michele, van der Plas, Lonneke
With this paper, we aim to put an issue on the agenda of AI ethics that in our view is overlooked in the current discourse. The current discussions are dominated by topics such as trustworthiness and bias, whereas the issue we like to focus on is count
External link:
http://arxiv.org/abs/2006.11814
Author:
Dhar, Prajit, van der Plas, Lonneke
We introduce temporally and contextually-aware models for the novel task of predicting unseen but plausible concepts, as conveyed by noun-noun compounds in a time-stamped corpus. We train compositional models on observed compounds, more specifically
External link:
http://arxiv.org/abs/1906.03634