Search results - "Florian, Radu"

Report

Multi-Document Grounded Multi-Turn Synthetic Dialog Generation

Authors: Lee, Young-Suk; Gunasekara, Chulaka; Contractor, Danish; Astudillo, Ramón Fernandez; Florian, Radu

We introduce a technique for multi-document grounded multi-turn synthetic dialog generation that incorporates three main ideas. First, we control the overall dialog flow using taxonomy-driven user queries that are generated with Chain-of-Thought (CoT

External link: http://arxiv.org/abs/2409.11500

Report

Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels

Authors: Xian, Jasper; Samuel, Saron; Khoubsirat, Faraz; Pradeep, Ronak; Sultan, Md Arafat; Florian, Radu; Roukos, Salim; Sil, Avirup; Potts, Christopher; Khattab, Omar

We develop a method for training small-scale (under 100M parameter) neural information retrieval models with as few as 10 gold relevance labels. The method depends on generating synthetic queries for documents using a language model (LM), and the key

External link: http://arxiv.org/abs/2406.11706

Report

Can a Multichoice Dataset be Repurposed for Extractive Question Answering?

Authors: Lynn, Teresa; Altakrori, Malik H.; Magdy, Samar Mohamed; Das, Rocktim Jyoti; Lyu, Chenyang; Nasr, Mohamed; Samih, Younes; Aji, Alham Fikri; Nakov, Preslav; Godbole, Shantanu; Roukos, Salim; Florian, Radu; Habash, Nizar

The rapid evolution of Natural Language Processing (NLP) has favored major languages such as English, leaving a significant gap for many others due to limited resources. This is especially evident in the context of data annotation, a task whose impor

External link: http://arxiv.org/abs/2404.17342

Report

CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems

Authors: Rosenthal, Sara; Sil, Avirup; Florian, Radu; Roukos, Salim

Retrieval Augmented Generation (RAG) has become a popular application for large language models. It is preferable that successful RAG systems provide accurate answers that are supported by being grounded in a passage without any hallucinations. While

External link: http://arxiv.org/abs/2404.02103

Report

Self-Refinement of Language Models from External Proxy Metrics Feedback

Authors: Ramji, Keshav; Lee, Young-Suk; Astudillo, Ramón Fernandez; Sultan, Md Arafat; Naseem, Tahira; Munawar, Asim; Florian, Radu; Roukos, Salim

It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response. In document-grounded response generation, for example, agent responses are expected to be relevant to a user's query while also being gro

External link: http://arxiv.org/abs/2403.00827

Report

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Authors: Lee, Young-Suk; Sultan, Md Arafat; El-Kurdi, Yousef; Naseem, Tahira; Munawar, Asim; Florian, Radu; Roukos, Salim; Astudillo, Ramón Fernandez

Published in: EMNLP 2023

Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitatio

External link: http://arxiv.org/abs/2310.13961

Report

Slide, Constrain, Parse, Repeat: Synchronous Sliding Windows for Document AMR Parsing

Authors: Kumaravel, Sadhana; Naseem, Tahira; Astudillo, Ramon Fernandez; Florian, Radu; Roukos, Salim

The sliding window approach provides an elegant way to handle contexts of sizes larger than the Transformer's input window, for tasks like language modeling. Here we extend this approach to the sequence-to-sequence task of document parsing. For this,

External link: http://arxiv.org/abs/2305.17273

Report

AMR Parsing with Instruction Fine-tuned Pre-trained Language Models

Authors: Lee, Young-Suk; Astudillo, Ramón Fernandez; Florian, Radu; Naseem, Tahira; Roukos, Salim

Language models instruction fine-tuned on a collection of instruction-annotated datasets (FLAN) have proven highly effective at improving model performance and generalization to unseen tasks. However, a majority of standard parsing tasks including abstr

External link: http://arxiv.org/abs/2304.12272

Report

UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers

Authors: Saad-Falcon, Jon; Khattab, Omar; Santhanam, Keshav; Florian, Radu; Franz, Martin; Roukos, Salim; Sil, Avirup; Sultan, Md Arafat; Potts, Christopher

Many information retrieval tasks require large labeled datasets for fine-tuning. However, such datasets are often unavailable, and their utility for real-world applications can diminish quickly due to domain shifts. To address this challenge, we deve

External link: http://arxiv.org/abs/2303.00807
