Výsledky vyhledávání - "Astudillo, Ramon"

Report

Multi-Document Grounded Multi-Turn Synthetic Dialog Generation

Autor: Lee, Young-Suk, Gunasekara, Chulaka, Contractor, Danish, Astudillo, Ramón Fernandez, Florian, Radu

We introduce a technique for multi-document grounded multi-turn synthetic dialog generation that incorporates three main ideas. First, we control the overall dialog flow using taxonomy-driven user queries that are generated with Chain-of-Thought (CoT

Externí odkaz: http://arxiv.org/abs/2409.11500

Zobrazit plný text záznamu

Report

The Future of Open Human Feedback

Human feedback on conversations with language language models (LLMs) is central to how these systems learn about the world, improve their capabilities, and are steered toward desirable and safe behaviors. However, this feedback is mostly collected by

Externí odkaz: http://arxiv.org/abs/2408.16961

Zobrazit plný text záznamu

Report

Self-Refinement of Language Models from External Proxy Metrics Feedback

Autor: Ramji, Keshav, Lee, Young-Suk, Astudillo, Ramón Fernandez, Sultan, Md Arafat, Naseem, Tahira, Munawar, Asim, Florian, Radu, Roukos, Salim

It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response. In document-grounded response generation, for example, agent responses are expected to be relevant to a user's query while also being gro

Externí odkaz: http://arxiv.org/abs/2403.00827

Zobrazit plný text záznamu

Report

Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations

Autor: Sultan, Md Arafat, Ganhotra, Jatin, Astudillo, Ramón Fernandez

We introduce a structured chain-of-thought (SCoT) prompting approach to generating content-grounded multi-turn question-answer conversations using a pre-trained large language model (LLM). At the core of our proposal is a structured breakdown of the

Externí odkaz: http://arxiv.org/abs/2402.11770

Zobrazit plný text záznamu

Report

BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback

Autor: Pandey, Gaurav, Nandwani, Yatin, Naseem, Tahira, Mishra, Mayank, Xu, Guangxuan, Raghu, Dinesh, Joshi, Sachindra, Munawar, Asim, Astudillo, Ramón Fernandez

Distribution matching methods for language model alignment such as Generation with Distributional Control (GDC) and Distributional Policy Gradient (DPG) have not received the same level of attention in reinforcement learning from human feedback (RLHF

Externí odkaz: http://arxiv.org/abs/2402.02479

Zobrazit plný text záznamu

Report

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Autor: Lee, Young-Suk, Sultan, Md Arafat, El-Kurdi, Yousef, Munawar, Tahira Naseem Asim, Florian, Radu, Roukos, Salim, Astudillo, Ramón Fernandez

Publikováno v: EMNLP 2023

Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitatio

Externí odkaz: http://arxiv.org/abs/2310.13961

Zobrazit plný text záznamu

Report

Formally Specifying the High-Level Behavior of LLM-Based Agents

Autor: Crouse, Maxwell, Abdelaziz, Ibrahim, Astudillo, Ramon, Basu, Kinjal, Dan, Soham, Kumaravel, Sadhana, Fokoue, Achille, Kapanipathi, Pavan, Roukos, Salim, Lastras, Luis

Autonomous, goal-driven agents powered by LLMs have recently emerged as promising tools for solving challenging problems without the need for task-specific finetuned models that can be expensive to procure. Currently, the design and implementation of

Externí odkaz: http://arxiv.org/abs/2310.08535

Zobrazit plný text záznamu

Report

Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency

Autor: Crouse, Maxwell, Astudillo, Ramon, Naseem, Tahira, Chaudhury, Subhajit, Kapanipathi, Pavan, Roukos, Salim, Gray, Alexander

We introduce Logical Offline Cycle Consistency Optimization (LOCCO), a scalable, semi-supervised method for training a neural semantic parser. Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is use

Externí odkaz: http://arxiv.org/abs/2305.20018

Zobrazit plný text záznamu

Report

Slide, Constrain, Parse, Repeat: Synchronous SlidingWindows for Document AMR Parsing

Autor: Kumaravel, Sadhana, Naseem, Tahira, Astudillo, Ramon Fernandez, Florian, Radu, Roukos, Salim

The sliding window approach provides an elegant way to handle contexts of sizes larger than the Transformer's input window, for tasks like language modeling. Here we extend this approach to the sequence-to-sequence task of document parsing. For this,

Externí odkaz: http://arxiv.org/abs/2305.17273

Zobrazit plný text záznamu

Report

Laziness Is a Virtue When It Comes to Compositionality in Neural Semantic Parsing

Autor: Crouse, Maxwell, Kapanipathi, Pavan, Chaudhury, Subhajit, Naseem, Tahira, Astudillo, Ramon, Fokoue, Achille, Klinger, Tim

Nearly all general-purpose neural semantic parsers generate logical forms in a strictly top-down autoregressive fashion. Though such systems have achieved impressive results across a variety of datasets and domains, recent works have called into ques

Externí odkaz: http://arxiv.org/abs/2305.04346

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání