Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP.

Authors: Soysal E; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Warner JL; Department of Medicine, Vanderbilt University, Nashville, Tennessee.; Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee.; Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee., Wang J; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Jiang M; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Harvey K; Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee., Jain SK; Vanderbilt School of Medicine, Vanderbilt University, Nashville, Tennessee., Dong X; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Song HY; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Siddhanamatha H; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Wang L; Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, Minnesota., Dai Q; Department of Medicine, Vanderbilt University, Nashville, Tennessee., Chen Q; Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee., Du X; School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas., Tao C; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas., Yang P; Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, Minnesota., Denny JC; Department of Medicine, Vanderbilt University, Nashville, Tennessee.; Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee., Liu H; Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, Minnesota., Xu H; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas.
Language: English
Source: Studies in health technology and informatics [Stud Health Technol Inform] 2019 Aug 21; Vol. 264, pp. 1041-1045.
DOI: 10.3233/SHTI190383
Abstract: Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated extraction of phenotypic information from narratives in electronic health records (EHRs), such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting a comprehensive range of cancer-related information from pathology reports (e.g., tumor size, tumor stage, and biomarkers) by leveraging the existing CLAMP system, which provides user-friendly interfaces for building NLP solutions customized to individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that a customized NLP system could be built quickly, with performance comparable to that of an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.
Database: MEDLINE