Showing 1 - 10 of 18 for the search: '"Martinez, Richard Diehl"'
Language models are typically trained on large corpora of text in their default orthographic form. However, this is not the only option; representing data as streams of phonemes can offer unique advantages, from deeper insights into phonological language … (a minimal phoneme-conversion sketch follows the link below).
External link:
http://arxiv.org/abs/2410.22906
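The phoneme-stream idea above can be illustrated with an off-the-shelf grapheme-to-phoneme tool. The following is a minimal sketch assuming the open-source phonemizer package with an espeak-ng backend; it is illustrative only, not the pipeline used in the paper.

```python
# Minimal sketch: converting orthographic text into a phoneme stream.
# Assumes the open-source `phonemizer` package and an installed espeak-ng
# backend; this is illustrative only, not the paper's pipeline.
from phonemizer import phonemize

sentences = [
    "Language models are typically trained on orthographic text.",
    "Phoneme streams offer an alternative representation.",
]

# IPA phonemes, one space-separated stream per input sentence.
phoneme_streams = phonemize(
    sentences,
    language="en-us",
    backend="espeak",
    strip=True,
)

for stream in phoneme_streams:
    print(stream)
```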
Curriculum Learning has been a popular strategy to improve the cognitive plausibility of Small-Scale Language Models (SSLMs) in the BabyLM Challenge. However, it has not led to considerable improvements over non-curriculum models. We assess whether … (a generic curriculum-ordering sketch follows the link below).
External link:
http://arxiv.org/abs/2410.22886
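Curriculum learning generally means presenting examples in order of some difficulty measure. The sketch below orders a toy corpus by sentence length, one common proxy; the curricula actually assessed in the paper may be entirely different.

```python
# Generic curriculum sketch: order training examples by a difficulty proxy
# (here, token count). This only illustrates the general strategy; the
# specific curricula assessed in the paper may differ.
corpus = [
    "the cat sat",
    "language models learn statistical regularities from text",
    "dogs bark",
    "curriculum learning presents easy examples before harder ones",
]

def difficulty(sentence: str) -> int:
    # Simple proxy: number of whitespace-separated tokens.
    return len(sentence.split())

curriculum = sorted(corpus, key=difficulty)
for sentence in curriculum:
    print(difficulty(sentence), sentence)
```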
Language models strongly rely on frequency information because they maximize the likelihood of tokens during pre-training. As a consequence, language models tend not to generalize well to tokens that are seldom seen during training. Moreover, maximum … (the standard maximum-likelihood objective is sketched after the link below).
External link:
http://arxiv.org/abs/2410.11462
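The maximum-likelihood objective referred to in that abstract is the standard token-level cross-entropy loss. The sketch below shows that textbook objective in PyTorch on random tensors; it illustrates why rarely seen tokens contribute little to the training signal, and is not the paper's proposed method.

```python
# Standard maximum-likelihood (next-token cross-entropy) objective that
# pre-training maximizes; rare tokens contribute few terms to this sum,
# which is the frequency reliance described above. Illustrative only.
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 100, 8, 4
logits = torch.randn(batch, seq_len, vocab_size)          # model outputs
targets = torch.randint(0, vocab_size, (batch, seq_len))  # next tokens

# Negative log-likelihood averaged over all target tokens:
# loss = -(1/N) * sum_t log p_theta(x_t | x_<t)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())
```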
Increasing the number of parameters in language models is a common strategy to enhance their performance. However, smaller language models remain valuable due to their lower operational costs. Despite their advantages, smaller models frequently under…
External link:
http://arxiv.org/abs/2410.11451
Author:
Martinez, Richard Diehl, Goriely, Zebulon, McGovern, Hope, Davis, Christopher, Caines, Andrew, Buttery, Paula, Beinborn, Lisa
We describe our team's contribution to the STRICT-SMALL track of the BabyLM Challenge. The challenge requires training a language model from scratch using only a relatively small training dataset of ten million words. We experiment with three variants …
External link:
http://arxiv.org/abs/2311.08886
Author:
Martinez, Richard Diehl, Novotney, Scott, Bulyko, Ivan, Rastrow, Ariya, Stolcke, Andreas, Gandhe, Ankur
Language modeling (LM) for automatic speech recognition (ASR) does not usually incorporate utterance-level contextual information. For some domains like voice assistants, however, additional context, such as the time at which an utterance was spoken, … (a hypothetical context-conditioning sketch follows the link below).
External link:
http://arxiv.org/abs/2106.01451
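One simple way to give a language model utterance-level context such as the time of day is to embed the context and concatenate it with the token embeddings. The sketch below is a hypothetical illustration of that idea with an LSTM; the names, dimensions, and architecture are assumptions, not the model described in the paper.

```python
# Hypothetical sketch: conditioning an LSTM language model on an
# utterance-level context feature (e.g. hour of day) by concatenating a
# learned context embedding to every token embedding. Not the paper's model.
import torch
import torch.nn as nn

class ContextualLM(nn.Module):
    def __init__(self, vocab_size=1000, n_contexts=24, d_tok=128, d_ctx=16, d_hid=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_tok)
        self.ctx_emb = nn.Embedding(n_contexts, d_ctx)   # e.g. hour of day
        self.rnn = nn.LSTM(d_tok + d_ctx, d_hid, batch_first=True)
        self.out = nn.Linear(d_hid, vocab_size)

    def forward(self, tokens, context_id):
        # tokens: (batch, seq_len); context_id: (batch,)
        tok = self.tok_emb(tokens)
        ctx = self.ctx_emb(context_id).unsqueeze(1).expand(-1, tokens.size(1), -1)
        hidden, _ = self.rnn(torch.cat([tok, ctx], dim=-1))
        return self.out(hidden)  # next-token logits

model = ContextualLM()
logits = model(torch.randint(0, 1000, (2, 10)), torch.tensor([9, 21]))
print(logits.shape)  # (2, 10, 1000)
```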
Author:
Pryzant, Reid, Martinez, Richard Diehl, Dass, Nathan, Kurohashi, Sadao, Jurafsky, Dan, Yang, Diyi
Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes …
External link:
http://arxiv.org/abs/1911.09709
We present a method for a wine recommendation system that employs multidimensional clustering and unsupervised learning methods. Our algorithm first performs clustering on a large corpus of wine reviews. It then uses the resulting wine clusters as an … (a generic review-clustering sketch follows the link below).
External link:
http://arxiv.org/abs/1807.00692
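The clustering step described in that abstract can be approximated with a generic pipeline of TF-IDF features and k-means. The sketch below uses made-up reviews and an arbitrary cluster count; the paper's actual features, number of clusters, and recommendation step are not reproduced here.

```python
# Generic sketch: cluster wine reviews by TF-IDF text features with k-means.
# The review texts, feature choice, and k are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

reviews = [
    "bright acidity with citrus and green apple notes",
    "full bodied red with dark cherry and firm tannins",
    "crisp mineral finish, lemon zest and flint",
    "jammy blackberry, vanilla oak and soft tannins",
]

features = TfidfVectorizer(stop_words="english").fit_transform(reviews)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

for review, label in zip(reviews, labels):
    print(label, review)
```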
In this paper, we examine the use case of generative adversarial networks (GANs) in the field of marketing. In particular, we analyze how GAN models can replicate text patterns from successful product listings on Airbnb, a peer-to-peer online market for … (a structural GAN sketch follows the link below).
External link:
http://arxiv.org/abs/1806.11432
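A GAN pairs a generator against a discriminator trained with opposing objectives. The sketch below shows one generic training step on continuous vectors only, as a structural illustration; generating discrete text as in the paper requires additional machinery that is omitted here.

```python
# Structural sketch of one GAN training step (generator vs. discriminator)
# on continuous vectors. Text generation as studied in the paper needs
# extra techniques for discrete outputs that are omitted here.
import torch
import torch.nn as nn

d_noise, d_data = 16, 32
G = nn.Sequential(nn.Linear(d_noise, 64), nn.ReLU(), nn.Linear(64, d_data))
D = nn.Sequential(nn.Linear(d_data, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(8, d_data)      # stand-in for real examples
noise = torch.randn(8, d_noise)

# Discriminator step: push real toward 1, generated toward 0.
fake = G(noise).detach()
loss_d = bce(D(real), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to make D label generated samples as real.
loss_g = bce(D(G(noise)), torch.ones(8, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
print(loss_d.item(), loss_g.item())
```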
We introduce Ignition: an end-to-end neural network architecture for training unconstrained self-driving vehicles in simulated environments. The model is a ResNet-18 variant, which is fed images from the front of a simulated F1 car and outputs op… (a minimal ResNet-18 regression sketch follows the link below).
External link:
http://arxiv.org/abs/1806.11349
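A ResNet-18 backbone with its classification head replaced by a small regression head is one straightforward way to map front-camera frames to continuous driving outputs. The sketch below assumes two outputs (for example steering and throttle) and a recent torchvision; this is an illustrative guess, not necessarily the paper's exact output space.

```python
# Sketch: ResNet-18 variant mapping a front-camera frame to continuous
# control outputs. The two-dimensional output (e.g. steering, throttle)
# is an illustrative assumption, not necessarily the paper's setup.
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights=None)                 # train from scratch in simulation
model.fc = nn.Linear(model.fc.in_features, 2)  # e.g. [steering, throttle]

frames = torch.randn(4, 3, 224, 224)           # batch of RGB front-camera frames
controls = model(frames)
print(controls.shape)                          # torch.Size([4, 2])
```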