Search results - "van der Plas, Lonneke"

Report

Understanding the effects of language-specific class imbalance in multilingual fine-tuning

Authors: Jung, Vincent, van der Plas, Lonneke

We study the effect of one type of imbalance often present in real-life multilingual classification datasets: an uneven distribution of labels across languages. We show evidence that fine-tuning a transformer-based Large Language Model (LLM) on a dat

External link: http://arxiv.org/abs/2402.13016

Report

Can language models learn analogical reasoning? Investigating training objectives and comparisons to human performance

Authors: Petersen, Molly R., van der Plas, Lonneke

While analogies are a common way to evaluate word embeddings in NLP, it is also of interest to investigate whether or not analogical reasoning is a task in itself that can be learned. In this paper, we test several ways to learn basic analogical reas

External link: http://arxiv.org/abs/2310.05597

Report

Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese

Authors: Micallef, Kurt, Gatt, Albert, Tanti, Marc, van der Plas, Lonneke, Borg, Claudia

Multilingual language models such as mBERT have seen impressive cross-lingual transfer to a variety of languages, but many languages remain excluded from these models. In this paper, we analyse the effect of pre-training with monolingual data for a l

External link: http://arxiv.org/abs/2205.10517

Report

Analysis of Data Augmentation Methods for Low-Resource Maltese ASR

Authors: DeMarco, Andrea, Mena, Carlos, Gatt, Albert, Borg, Claudia, Williams, Aiden, van der Plas, Lonneke

Recent years have seen an increased interest in the computational speech processing of Maltese, but resources remain sparse. In this paper, we consider data augmentation techniques for improving speech recognition for low-resource languages, focusing

External link: http://arxiv.org/abs/2111.07793

Report

On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning

Authors: Tanti, Marc, van der Plas, Lonneke, Borg, Claudia, Gatt, Albert

Recent work has shown evidence that the knowledge acquired by multilingual BERT (mBERT) has two components: a language-specific and a language-neutral one. This paper analyses the relationship between them, in the context of fine-tuning on two tasks

External link: http://arxiv.org/abs/2109.06935

Report

Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis

Authors: Assimakopoulos, Stavros, Muskat, Rebecca Vella, van der Plas, Lonneke, Gatt, Albert

This paper presents a novel scheme for the annotation of hate speech in corpora of Web 2.0 commentary. The proposed scheme is motivated by the critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTI

External link: http://arxiv.org/abs/2008.06222

Report

MASRI-HEADSET: A Maltese Corpus for Speech Recognition

Authors: Mena, Carlos, Gatt, Albert, DeMarco, Andrea, Borg, Claudia, van der Plas, Lonneke, Muscat, Amanda, Padovani, Ian

Maltese, the national language of Malta, is spoken by approximately 500,000 people. Speech processing for Maltese is still in its early stages of development. In this paper, we present the first spoken Maltese corpus designed purposely for Automatic

External link: http://arxiv.org/abs/2008.05760

Report

The societal and ethical relevance of computational creativity

Authors: Loi, Michele, Viganò, Eleonora, van der Plas, Lonneke

In this paper, we provide a philosophical account of the value of creative systems for individuals and society. We characterize creativity in very broad philosophical terms, encompassing natural, existential, and social creative processes, such as na

External link: http://arxiv.org/abs/2007.11973

Report

A blindspot of AI ethics: anti-fragility in statistical prediction

Authors: Loi, Michele, van der Plas, Lonneke

With this paper, we aim to put an issue on the agenda of AI ethics that in our view is overlooked in the current discourse. The current discussions are dominated by topics such as trustworthiness and bias, whereas the issue we like to focus on is count

External link: http://arxiv.org/abs/2006.11814

Report

Learning to Predict Novel Noun-Noun Compounds

Authors: Dhar, Prajit, van der Plas, Lonneke

We introduce temporally and contextually-aware models for the novel task of predicting unseen but plausible concepts, as conveyed by noun-noun compounds in a time-stamped corpus. We train compositional models on observed compounds, more specifically

External link: http://arxiv.org/abs/1906.03634
