Towards a Robust Retrieval-Based Summarization System

Author: Liu, Shengjie; Wu, Jing; Bao, Jingyuan; Wang, Wenyi; Hovakimyan, Naira; Healey, Christopher G
Year of publication: 2024
Subject:
Document type: Working Paper
Description: This paper describes an investigation of the robustness of large language models (LLMs) for retrieval augmented generation (RAG)-based summarization tasks. While LLMs provide summarization capabilities, their performance in complex, real-world scenarios remains under-explored. Our first contribution is LogicSumm, an innovative evaluation framework incorporating realistic scenarios to assess LLM robustness during RAG-based summarization. Based on limitations identified by LogicSumm, we then developed SummRAG, a comprehensive system to create training dialogues and fine-tune a model to enhance robustness within LogicSumm's scenarios. SummRAG is an example of our goal of defining structured methods to test the capabilities of an LLM, rather than addressing issues in a one-off fashion. Experimental results confirm the effectiveness of SummRAG, showcasing improved logical coherence and summarization quality. Data, corresponding model weights, and Python code are available online.
Database: arXiv
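
The abstract refers to RAG-based summarization, i.e. retrieving relevant passages and conditioning an LLM summary on them. The following is a minimal illustrative sketch of such a pipeline, not the authors' SummRAG implementation: the word-overlap retriever, the prompt template, and the summarize_with_llm stub are all hypothetical placeholders.

```python
# Minimal sketch of a RAG-style summarization pipeline (illustrative only;
# the retriever, prompt template, and summarize_with_llm stub are hypothetical,
# not the SummRAG system described in the paper).

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def retrieve(query: str, corpus: list[Document], top_k: int = 2) -> list[Document]:
    """Rank documents by naive word overlap with the query (stand-in for a real retriever)."""
    query_terms = set(query.lower().split())

    def overlap(doc: Document) -> int:
        return len(query_terms & set(doc.text.lower().split()))

    return sorted(corpus, key=overlap, reverse=True)[:top_k]


def build_prompt(query: str, retrieved: list[Document]) -> str:
    """Assemble retrieved passages and the summarization instruction into one prompt."""
    context = "\n\n".join(f"[{d.doc_id}] {d.text}" for d in retrieved)
    return (
        "Summarize the following retrieved passages with respect to the topic.\n"
        f"Topic: {query}\n\nPassages:\n{context}\n\nSummary:"
    )


def summarize_with_llm(prompt: str) -> str:
    """Placeholder for an LLM call; swap in an actual model or API client here."""
    return f"<summary of {prompt.count('[')} retrieved passage(s)>"


if __name__ == "__main__":
    corpus = [
        Document("d1", "Retrieval augmented generation grounds model outputs in retrieved text."),
        Document("d2", "Fine-tuning on structured dialogues can improve summarizer robustness."),
        Document("d3", "Unrelated passage about weather patterns in coastal regions."),
    ]
    docs = retrieve("robust RAG summarization", corpus)
    print(summarize_with_llm(build_prompt("robust RAG-based summarization", docs)))
```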