Fact Finder -- Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs

Authors: Steinigen, Daniel; Teucher, Roman; Ruland, Timm Heine; Rudat, Max; Flores-Herr, Nicolas; Fischer, Peter; Milosevic, Nikola; Schymura, Christopher; Ziletti, Angelo
Year of publication: 2024
Subject:
Document type: Working Paper
Description: Recent advancements in Large Language Models (LLMs) have showcased their proficiency in answering natural language queries. However, their effectiveness is hindered by limited domain-specific knowledge, raising concerns about the reliability of their responses. We introduce a hybrid system that augments LLMs with domain-specific knowledge graphs (KGs), thereby aiming to enhance factual correctness using a KG-based retrieval approach. We focus on a medical KG to demonstrate our methodology, which includes (1) pre-processing, (2) Cypher query generation, (3) Cypher query processing, (4) KG retrieval, and (5) LLM-enhanced response generation. We evaluate our system on a curated dataset of 69 samples, achieving a precision of 78% in retrieving correct KG nodes. Our findings indicate that the hybrid system surpasses a standalone LLM in accuracy and completeness, as verified by an LLM-as-a-Judge evaluation method. This positions the system as a promising tool for applications that demand factual correctness and completeness, such as target identification -- a critical process in pinpointing biological entities for disease treatment or crop enhancement. Moreover, its intuitive search interface and ability to provide accurate responses within seconds make it well-suited for time-sensitive, precision-focused research contexts. We publish the source code together with the dataset and the prompt templates used.
Comment: 10 pages, 7 figures
Database: arXiv