Výsledky vyhledávání - "Rosenbaum, Andy"

Report

GeMQuAD : Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning

Autor: Namboori, Amani, Mangale, Shivam, Rosenbaum, Andy, Soltan, Saleh

The emergence of Large Language Models (LLMs) with capabilities like In-Context Learning (ICL) has ushered in new possibilities for data generation across various domains while minimizing the need for extensive data collection and modeling techniques

Externí odkaz: http://arxiv.org/abs/2404.09163

Zobrazit plný text záznamu

Report

Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models

Autor: Soltan, Saleh, Rosenbaum, Andy, Falke, Tobias, Lu, Qin, Rumshisky, Anna, Hamza, Wael

Pre-trained encoder-only and sequence-to-sequence (seq2seq) models each have advantages, however training both model types from scratch is computationally expensive. We explore recipes to improve pre-training efficiency by initializing one model from

Externí odkaz: http://arxiv.org/abs/2306.08756

Zobrazit plný text záznamu

Report

PLACES: Prompting Language Models for Social Conversation Synthesis

Autor: Chen, Maximillian, Papangelis, Alexandros, Tao, Chenyang, Kim, Seokhwan, Rosenbaum, Andy, Liu, Yang, Yu, Zhou, Hakkani-Tur, Dilek

Collecting high quality conversational data can be very expensive for most applications and infeasible for others due to privacy, ethical, or similar concerns. A promising direction to tackle this problem is to generate synthetic dialogues by prompti

Externí odkaz: http://arxiv.org/abs/2302.03269

Zobrazit plný text záznamu

Report

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Autor: Chen, Maximillian, Papangelis, Alexandros, Tao, Chenyang, Rosenbaum, Andy, Kim, Seokhwan, Liu, Yang, Yu, Zhou, Hakkani-Tur, Dilek

Dialogue understanding tasks often necessitate abundant annotated data to achieve good performance and that presents challenges in low-resource settings. To alleviate this barrier, we explore few-shot data augmentation for dialogue understanding by p

Externí odkaz: http://arxiv.org/abs/2210.14169

Zobrazit plný text záznamu

Report

CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing

Autor: Rosenbaum, Andy, Soltan, Saleh, Hamza, Wael, Saffari, Amir, Damonte, Marco, Groves, Isabel

A bottleneck to developing Semantic Parsing (SP) models is the need for a large volume of human-labeled training data. Given the complexity and cost of human annotation for SP, labeled data is often scarce, particularly in multilingual settings. Larg

Externí odkaz: http://arxiv.org/abs/2210.07074

Zobrazit plný text záznamu

Report

LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

Autor: Rosenbaum, Andy, Soltan, Saleh, Hamza, Wael, Versley, Yannick, Boese, Markus

We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In

Externí odkaz: http://arxiv.org/abs/2209.09900

Zobrazit plný text záznamu

Report

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

Autor: Soltan, Saleh, Ananthakrishnan, Shankar, FitzGerald, Jack, Gupta, Rahul, Hamza, Wael, Khan, Haidar, Peris, Charith, Rawls, Stephen, Rosenbaum, Andy, Rumshisky, Anna, Prakash, Chandana Satya, Sridhar, Mukund, Triefenbach, Fabian, Verma, Apurv, Tur, Gokhan, Natarajan, Prem

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various

Externí odkaz: http://arxiv.org/abs/2208.01448

Zobrazit plný text záznamu

Report

Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

Publikováno v: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA

We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M-170M parameters, and their application to the N

Externí odkaz: http://arxiv.org/abs/2206.07808

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání