Search results - "Vilar, David"

Report

Introducing the NewsPaLM MBR and QE Dataset: LLM-Generated High-Quality Parallel Data Outperforms Traditional Web-Crawled Data

Authors: Finkelstein, Mara; Vilar, David; Freitag, Markus

Recent research in neural machine translation (NMT) has shown that training on high-quality machine-generated data can outperform training on human-generated data. This work accompanies the first-ever release of an LLM-generated, MBR-decoded and QE-re

External link: http://arxiv.org/abs/2408.06537

Report

Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms

Authors: Trabelsi, Firas; Vilar, David; Finkelstein, Mara; Freitag, Markus

Minimum Bayes Risk (MBR) decoding is a powerful decoding strategy widely used for text generation tasks, but its quadratic computational complexity limits its practical application. This paper presents a novel approach for approximating MBR decoding

External link: http://arxiv.org/abs/2406.02832

Report

There's no Data Like Better Data: Using QE Metrics for MT Data Filtering

Authors: Peter, Jan-Thorsten; Vilar, David; Deutsch, Daniel; Finkelstein, Mara; Juraska, Juraj; Freitag, Markus

Quality Estimation (QE), the evaluation of machine translation output without the need for explicit references, has improved substantially in recent years with the use of neural metrics. In this paper we analyze the viability of using QE metrics for

External link: http://arxiv.org/abs/2311.05350

Report

Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model

Authors: Tomani, Christian; Vilar, David; Freitag, Markus; Cherry, Colin; Naskar, Subhajit; Finkelstein, Mara; Garcia, Xavier; Cremers, Daniel

Maximum-a-posteriori (MAP) decoding is the most widely used decoding strategy for neural machine translation (NMT) models. The underlying assumption is that model probability correlates well with human judgment, with better translations getting assig

External link: http://arxiv.org/abs/2310.06707

Report

Prompting PaLM for Translation: Assessing Strategies and Performance

Authors: Vilar, David; Freitag, Markus; Cherry, Colin; Luo, Jiaming; Ratnakar, Viresh; Foster, George

Large language models (LLMs) that have been trained on multilingual but not parallel text exhibit a remarkable ability to translate between languages. We probe this ability in an in-depth study of the Pathways Language Model (PaLM), which has demonst

External link: http://arxiv.org/abs/2211.09102

Report

Scaling Up Influence Functions

Authors: Schioppa, Andrea; Zablotskaia, Polina; Vilar, David; Sokolov, Artem

We address efficient calculation of influence functions for tracking predictions back to the training data. We propose and analyze a new approach to speeding up the inverse Hessian calculation based on Arnoldi iteration. With this improvement, we ach

External link: http://arxiv.org/abs/2112.03052

Report

Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits

Authors: Kreutzer, Julia; Vilar, David; Sokolov, Artem

Training data for machine translation (MT) is often sourced from a multitude of large corpora that are multi-faceted in nature, e.g. containing contents from multiple domains or different levels of quality or complexity. Naturally, these facets do no

External link: http://arxiv.org/abs/2110.06997

Report

The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020

Authors: Domhan, Tobias; Denkowski, Michael; Vilar, David; Niu, Xing; Hieber, Felix; Heafield, Kenneth

We present Sockeye 2, a modernized and streamlined version of the Sockeye neural machine translation (NMT) toolkit. New features include a simplified code base through the use of MXNet's Gluon API, a focus on state-of-the-art model architectures, dis

External link: http://arxiv.org/abs/2008.04885
