MedChain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking

Author:	Liu, Jie, Wang, Wenxuan, Ma, Zizhan, Huang, Guolin, Su, Yihang, Chang, Kao-Jung, Chen, Wenting, Li, Haoliang, Shen, Linlin, Lyu, Michael
Year of publication:	2024
Subject:	Computer Science - Computation and Language; Computer Science - Artificial Intelligence
Document type:	Working Paper
Description:	Clinical decision making (CDM) is a complex, dynamic process crucial to healthcare delivery, yet it remains a significant challenge for artificial intelligence systems. While Large Language Model (LLM)-based agents have been tested on general medical knowledge using licensing exams and knowledge question-answering tasks, their performance on CDM in real-world scenarios is limited by the lack of comprehensive testing datasets that mirror actual medical practice. To address this gap, we present MedChain, a dataset of 12,163 clinical cases that covers five key stages of the clinical workflow. MedChain distinguishes itself from existing benchmarks with three key features of real-world clinical practice: personalization, interactivity, and sequentiality. Further, to tackle real-world CDM challenges, we also propose MedChain-Agent, an AI system that integrates a feedback mechanism and an MCase-RAG module to learn from previous cases and adapt its responses. MedChain-Agent demonstrates remarkable adaptability in gathering information dynamically and handling sequential clinical tasks, significantly outperforming existing approaches. The relevant dataset and code will be released upon acceptance of this paper.
Database:	arXiv
External link:	http://arxiv.org/abs/2412.01605