Showing 1 - 10 of 1,472 for search: '"Artetxe, A."'
Author:
Sánchez, Eduardo, Alastruey, Belen, Ropers, Christophe, Stenetorp, Pontus, Artetxe, Mikel, Costa-jussà, Marta R.
We propose a new benchmark to measure a language model's linguistic reasoning skills without relying on pre-existing language-specific knowledge. The test covers 894 questions grouped in 160 problems across 75 (mostly) extremely low-resource language…
External link:
http://arxiv.org/abs/2409.12126
Large Language Models (LLMs) exhibit extensive knowledge about the world, but most evaluations have been limited to global or anglocentric subjects. This raises the question of how well these models perform on topics relevant to other cultures, whose…
External link:
http://arxiv.org/abs/2406.07302
Author:
Padlewski, Piotr, Bain, Max, Henderson, Matthew, Zhu, Zhongkai, Relan, Nishant, Pham, Hai, Ong, Donovan, Aleksiev, Kaloyan, Ormazabal, Aitor, Phua, Samuel, Yeo, Ethan, Lamprecht, Eugenie, Liu, Qi, Wang, Yuqi, Chen, Eric, Fu, Deyu, Li, Lei, Zheng, Che, d'Autume, Cyprien de Masson, Yogatama, Dani, Artetxe, Mikel, Tay, Yi
We introduce Vibe-Eval: a new open benchmark and framework for evaluating multimodal chat models. Vibe-Eval consists of 269 visual understanding prompts, including 100 of hard difficulty, complete with gold-standard responses authored by experts…
External link:
http://arxiv.org/abs/2405.02287
Author:
Reka Team, Ormazabal, Aitor, Zheng, Che, d'Autume, Cyprien de Masson, Yogatama, Dani, Fu, Deyu, Ong, Donovan, Chen, Eric, Lamprecht, Eugenie, Pham, Hai, Ong, Isaac, Aleksiev, Kaloyan, Li, Lei, Henderson, Matthew, Bain, Max, Artetxe, Mikel, Relan, Nishant, Padlewski, Piotr, Liu, Qi, Chen, Ren, Phua, Samuel, Yang, Yazheng, Tay, Yi, Wang, Yuqi, Zhu, Zhongkai, Xie, Zhihui
We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka. Reka models are able to process and reason with text, images, video, and audio inputs. This technical report discusses details of t…
External link:
http://arxiv.org/abs/2404.12387
Author:
Etxaniz, Julen, Sainz, Oscar, Perez, Naiara, Aldabe, Itziar, Rigau, German, Agirre, Eneko, Ormazabal, Aitor, Artetxe, Mikel, Soroa, Aitor
Published in:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14952--14972. 2024
We introduce Latxa, a family of large language models for Basque ranging from 7 to 70 billion parameters. Latxa is based on Llama 2, which we continue pretraining on a new Basque corpus comprising 4.3M documents and 4.2B tokens. Addressing the scarci…
External link:
http://arxiv.org/abs/2403.20266
While machine translation (MT) systems have seen significant improvements, it is still common for translations to reflect societal biases, such as gender bias. Decoder-only Large Language Models (LLMs) have demonstrated potential in MT, albeit with p…
External link:
http://arxiv.org/abs/2309.03175
Author:
Iskander Aurrekoetxea-Rodriguez, So Young Lee, Miriam Rábano, Isabel Gris-Cárdenas, Virginia Gamboa-Aldecoa, Irantzu Gorroño, Isabella Ramella-Gal, Connor Parry, Robert M. Kypta, Beñat Artetxe, Juan M. Gutierrez-Zorrilla, Maria dM. Vivanco
Published in:
Cell Communication and Signaling, Vol 22, Iss 1, Pp 1-16 (2024)
Background: Increased cancer stem cell (CSC) content and SOX2 overexpression are common features in the development of resistance to therapy in hormone-dependent breast cancer, which remains an important clinical challenge. SOX2 has potential…
External link:
https://doaj.org/article/e9b3f04f62bf488abdfbb2913f741c71
Author:
Bandarkar, Lucas, Liang, Davis, Muller, Benjamin, Artetxe, Mikel, Shukla, Satya Narayan, Husa, Donald, Goyal, Naman, Krishnan, Abhinandan, Zettlemoyer, Luke, Khabsa, Madian
Published in:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, pages 749--775. 2024
We present Belebele, a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. Significantly expanding the language coverage of natural language understanding (NLU) benchmarks, this dataset enables the evaluation o…
External link:
http://arxiv.org/abs/2308.16884
As increasingly sophisticated language models emerge, their trustworthiness becomes a pivotal issue, especially in tasks such as summarization and question-answering. Ensuring their responses are contextually grounded and faithful is challenging due…
External link:
http://arxiv.org/abs/2308.12157
Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external machine translation system, and running inference over the translated input.
External link:
http://arxiv.org/abs/2308.01223
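The translate-test pipeline described in the last record can be sketched in a few lines. This is a minimal illustration only: `translate_to_english` and `classify` below are hypothetical stand-ins (a toy lookup table and a keyword rule) for the external machine translation system and the English-centric model that the abstract refers to.

```python
# Sketch of translate-test: translate the input into English with an
# external MT system, then run inference over the translated input.
# Both components here are toy stand-ins, not real systems.

def translate_to_english(text: str, src_lang: str) -> str:
    # Hypothetical external MT system, mocked as a lookup table.
    toy_mt = {("Kaixo mundua", "eu"): "Hello world"}
    return toy_mt.get((text, src_lang), text)

def classify(english_text: str) -> str:
    # Hypothetical English-only model, mocked as a keyword rule.
    return "greeting" if "hello" in english_text.lower() else "other"

def translate_test(text: str, src_lang: str) -> str:
    """Run inference on the English translation of a non-English input."""
    return classify(translate_to_english(text, src_lang))
```

In practice the MT step would be a real translation system and the final step any English-centric model; the pipeline shape stays the same.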