Scalable relevance ranking algorithm via semantic similarity assessment improves efficiency of medical chart review

Authors:	Tianrun Cai, Zeling He, Chuan Hong, Yichi Zhang, Yuk-Lam Ho, Jacqueline Honerlaw, Alon Geva, Vidul Ayakulangara Panickan, Amanda King, David R Gagnon, Michael Gaziano, Kelly Cho, Katherine Liao, Tianxi Cai
Publication year:	2022
Subject:	Phenotype; Electronic Health Records; Humans; Health Informatics; Precision Medicine; Algorithms; Natural Language Processing; Semantics; Computer Science Applications
Source:	Journal of Biomedical Informatics. 132:104109
ISSN:	1532-0464
Description:	Accurately assigning phenotype information to individual patients via computational phenotyping using Electronic Health Records (EHRs) has been seen as the first step towards enabling EHRs for precision medicine research. Chart review labels annotated by clinical experts, also known as "gold standard" labels, are essential for the development and validation of computational phenotyping algorithms. However, given the complexity of EHR systems, the process of chart review is both labor intensive and time consuming. We propose a fully automated algorithm, referred to as pGUESS, to rank EHR notes according to their relevance to a given phenotype. By identifying the most relevant notes, pGUESS can greatly improve the efficiency and accuracy of chart reviews. pGUESS uses prior guided semantic similarity to measure the informativeness of a clinical note to a given phenotype. We first select candidate clinical concepts from a pool of comprehensive medical concepts using public knowledge sources and then derive the semantic embedding vector (SEV) for a reference article (SEV [...] The algorithm was validated against four sets of 200 notes that were manually annotated by clinical experts to assess their informativeness to one of three disease phenotypes. The pGUESS algorithm substantially outperforms existing unsupervised approaches for classifying the relevance status with respect to both accuracy and scalability across phenotypes. Averaging over the three phenotypes, the rank correlation between the algorithm ranking and gold standard label was 0.64 for pGUESS, but only 0.47 and 0.35 for the next two best performing algorithms. pGUESS is also much more computationally scalable compared to existing algorithms. The pGUESS algorithm can substantially reduce the burden of chart review and holds potential in improving the efficiency and accuracy of human annotation.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f06aa0632ee3fc8fec9e5bd3e6a2902c https://doi.org/10.1016/j.jbi.2022.104109
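
The description above outlines ranking clinical notes by how semantically similar each note is to a reference embedding for the phenotype. The following is a minimal sketch of that general idea only, not the pGUESS method itself: it assumes each note has already been reduced to a list of concept names, uses an unweighted average of concept vectors as the note embedding, and scores notes by cosine similarity. The concept names, the 2-dimensional vectors, and the function names are all hypothetical, and the prior-guided weighting mentioned in the description is not reproduced because the record does not specify it.

```python
import math

def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_notes(note_concepts, concept_vectors, reference_vector):
    """Score each note against the reference and sort from most to least similar.

    Assumption: a note embedding is the plain average of its known concept
    vectors. This is an illustrative simplification, not the published method.
    """
    scores = {}
    for note_id, concepts in note_concepts.items():
        vecs = [concept_vectors[c] for c in concepts if c in concept_vectors]
        scores[note_id] = cosine(mean_vector(vecs), reference_vector) if vecs else 0.0
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores

# Hypothetical toy data: made-up concepts and 2-dimensional embeddings.
concept_vectors = {
    "joint_pain": [1.0, 0.0],
    "methotrexate": [0.9, 0.1],
    "flu_shot": [0.0, 1.0],
}
reference_vector = [1.0, 0.0]  # stands in for the phenotype reference embedding
note_concepts = {
    "note_a": ["joint_pain", "methotrexate"],
    "note_b": ["flu_shot"],
    "note_c": ["joint_pain", "flu_shot"],
}

ranking, scores = rank_notes(note_concepts, concept_vectors, reference_vector)
print(ranking)
```

In this toy setup a reviewer would read the notes in the order given by `ranking`, starting with the note whose concepts lie closest to the reference vector. A ranking of this kind could then be compared against expert labels with a rank correlation, which is the type of evaluation the description reports.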