A framework enabling LLMs into regulatory environment for transparency and trustworthiness and its application to drug labeling document.

Authors: Wu L; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA. Electronic address: Leihong.wu@fda.hhs.gov., Xu J; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA., Thakkar S; Office of Translational Sciences, Center for Drug Evaluation and Research (CDER), US FDA, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA., Gray M; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA., Qu Y; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA., Li D; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA., Tong W; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US FDA, 3900 NCTR Rd, Jefferson AR, 72211, USA. Electronic address: Weida.tong@fda.hhs.gov.
Language: English
Source: Regulatory toxicology and pharmacology : RTP [Regul Toxicol Pharmacol] 2024 May; Vol. 149, pp. 105613. Date of Electronic Publication: 2024 Apr 02.
DOI: 10.1016/j.yrtph.2024.105613
Abstract: Regulatory agencies routinely handle extensive document reviews, ranging from product submissions to internal and external communications. Large Language Models (LLMs) like ChatGPT can be invaluable tools for these tasks, but they present several challenges, particularly handling proprietary information, combining customized functions with specific review needs, and ensuring the transparency and explainability of the model's output. Hence, a localized and customized solution is imperative. To tackle these challenges, we formulated a framework named askFDALabel for FDA drug labeling documents, which are a crucial resource in the FDA drug review process. AskFDALabel operates within a secure IT environment and comprises two key modules: a semantic search module (Module S) and a Q&A/text-generation module (Module T). Module S is built on word embeddings to enable comprehensive semantic queries within labeling documents. Module T uses a tuned LLM to generate responses based on references from Module S. As a result, our framework enabled small LLMs to perform comparably to ChatGPT while serving as a computationally inexpensive solution for regulatory applications. To conclude, through AskFDALabel, we have showcased a pathway that harnesses LLMs to support agency operations within a secure environment, offering tailored functions for the needs of regulatory research.
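Illustrative sketch (not from the paper): the two-module flow described in the abstract, a semantic search step that supplies references to a generation step, might look roughly like the Python below. Everything here is an assumption made for illustration. The bag-of-words `embed` function is a toy stand-in for real word embeddings, the sample passages are invented and are not actual labeling text, and `semantic_search` and `build_prompt` are hypothetical names. The sketch stops at assembling a prompt with numbered references, since the abstract does not say which tuned LLM Module T uses or how it is called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(query, passages, top_k=2):
    """Search step: rank passages by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(p)), i, p) for i, p in enumerate(passages)]
    # Highest score first; ties broken by original passage order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, p) for score, i, p in scored[:top_k] if score > 0]


def build_prompt(question, references):
    """Generation step input: a prompt that carries numbered references."""
    cited = "\n".join(f"[{i}] {p}" for i, p in references)
    return (
        "Answer using only the numbered references below and cite them.\n"
        f"References:\n{cited}\n"
        f"Question: {question}\nAnswer:"
    )


# Invented example passages, not actual drug labeling text.
passages = [
    "Hepatotoxicity: cases of liver injury have been reported.",
    "Store at room temperature away from moisture.",
    "Monitor liver enzymes before and during treatment.",
]
refs = semantic_search("liver injury risk", passages)
prompt = build_prompt("What is the liver injury risk?", refs)
```

Keeping the passage indices in the prompt is one way a reviewer could trace each generated statement back to its source passage, which is the kind of transparency the abstract emphasizes.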
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Published by Elsevier Inc.)
Database: MEDLINE