Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions.

Author: Sunghwan Sohn (sohn.sunghwan@mayo.edu), Yanshan Wang, Chung-Il Wi, Elizabeth A. Krusemark, Euijung Ryu, Mir H. Ali, Young J. Juhn, Hongfang Liu
Source: Journal of the American Medical Informatics Association. Mar 2018, Vol. 25, Issue 3, pp. 353-359. 7 pp. 1 Color Photograph, 1 Black and White Photograph, 3 Charts, 3 Graphs.
Abstract: Objective: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. Materials and Methods: Birth cohorts from Mayo Clinic and Sanford Children's Hospital (SCH) were used in this study (n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in 3 aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. Results: There were notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, the corpora were relatively homogeneous at the semantic level (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo Clinic, and produced F-scores of 0.813 (prototype) and 0.908 (refinement) when applied at SCH. Discussion: The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important for estimating NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at the semantic level, the NLP system developed at Mayo Clinic was ported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity. [ABSTRACT FROM AUTHOR]
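Note: The abstract reports two kinds of figures, a corpus-level similarity score and an F-score. The sketch below shows one way such figures could be computed. It is an illustration only: the abstract does not state which similarity measure the authors used, so cosine similarity over raw word counts is an assumption here, and the function names (`word_counts`, `cosine_similarity`, `f_score`) and the toy notes are hypothetical.

```python
import math
from collections import Counter


def word_counts(notes):
    """Build a bag-of-words count over a list of note strings (naive whitespace tokenizer)."""
    counts = Counter()
    for note in notes:
        counts.update(note.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two word-count mappings (assumed measure, not the paper's)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def f_score(tp, fp, fn):
    """Balanced F-score (harmonic mean of precision and recall) from raw counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy stand-ins for two institutions' clinical notes (invented text, not study data).
    site_a = ["patient has asthma and wheezing", "cough at night"]
    site_b = ["child with asthma", "wheezing noted at night"]
    print("lexical similarity:", cosine_similarity(word_counts(site_a), word_counts(site_b)))
    print("F-score:", f_score(tp=8, fp=2, fn=2))
```

The same `cosine_similarity` function could in principle be applied to topic or concept count vectors instead of word counts, which is the distinction the abstract draws between lexical and semantic variation.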
Database: Library, Information Science & Technology Abstracts