Mining clinical phrases from nursing notes to discover risk factors of patient deterioration
Author: | Kenrick Cato, Jose P. Garcia, Haomiao Jia, Min-Jeoung Kang, Zfania Tom Korach, Jie Yang, Christopher Knaplund, Sarah Collins Rossetti, Jessica M. Schwartz, Kumiko O. Schnock, Li Zhou |
---|---|
Year of publication: | 2020 |
Subject: | Adult; Male; Phrase; 020205 medical informatics; Computer science; Judgement; Nurses; Health Informatics; 02 engineering and technology; computer.software_genre; Health informatics; Article; 03 medical and health sciences; 0302 clinical medicine; Risk Factors; 0202 electrical engineering, electronic engineering, information engineering; Data Mining; Electronic Health Records; Humans; 030212 general & internal medicine; Aged; Natural Language Processing; Event (probability theory); Proportional hazards model; business.industry; Rank (computer programming); Middle Aged; Risk factor (computing); Identification (information); Female; Artificial intelligence; business; computer; Natural language processing |
Source: | Int J Med Inform |
ISSN: | 1386-5056 |
DOI: | 10.1016/j.ijmedinf.2019.104053 |
Description: | Objective: Early identification and treatment of patient deterioration is crucial to improving clinical outcomes. To act, hospital rapid response (RR) teams often rely on nurses’ clinical judgement, typically documented narratively in the electronic health record (EHR). We developed a data-driven, unsupervised method to discover potential risk factors of RR events from nursing notes. Methods: We applied multiple natural language processing methods, including language modelling, word embeddings, and two phrase mining methods (TextRank and NC-Value), to identify quality phrases that represent clinical entities from unannotated nursing notes. TextRank was used to determine the important word sequences in each note. NC-Value was then used to globally rank the locally important sequences across the whole corpus. We evaluated our method both on its accuracy compared to human judgement and on the ability of the mined phrases to predict a clinical outcome, RR event hazard. Results: When applied to 61,740 hospital encounters with 1,067 RR events and 778,955 notes, our method achieved an average precision of 0.590 to 0.764 (when excluding numeric tokens). A Cox model with time-dependent covariates using the phrases achieved a concordance index of 0.739. Clustering the phrases revealed clinical concepts significantly associated with RR event hazard. Discussion: Our findings demonstrate that our minimal-annotation, unsupervised method can rapidly mine quality phrases from a large amount of nursing notes, and that these identified phrases are useful for downstream tasks, such as clinical outcome prediction and risk factor identification. |
Database: | OpenAIRE |
External link: |
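
The abstract says TextRank was used to find the important word sequences within each note. As a rough illustration of that general idea only, the following minimal sketch scores the words of a single tokenized note with a PageRank-style iteration over a co-occurrence graph. It is not the authors' implementation: the function name `textrank_words`, the window size, the damping factor, and the iteration count are all assumptions chosen for illustration, and the sketch covers neither the assembly of words into sequences nor the corpus-level NC-Value ranking.

```python
from collections import defaultdict


def textrank_words(tokens, window=2, damping=0.85, iterations=50):
    """Score words by a PageRank-style iteration on a co-occurrence graph.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    # Link every pair of distinct words that occur within `window` positions.
    neighbours = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    # Start from uniform scores and repeatedly redistribute them along edges.
    scores = {word: 1.0 for word in neighbours}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbours[n]) for n in neighbours[word])
            for word in neighbours
        }
    return scores


if __name__ == "__main__":
    # Made-up example note, not taken from the study data.
    note = "patient reports chest pain patient denies chest pain at rest".split()
    ranked = sorted(textrank_words(note).items(), key=lambda kv: kv[1], reverse=True)
    for word, score in ranked:
        print(f"{word}\t{score:.3f}")
```

Words that co-occur with many different neighbours end up with higher scores. A fuller pipeline would presumably merge adjacent high-scoring words into candidate phrases before any corpus-wide ranking, but those steps are left out here.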