Author: |
Gundlapalli, A. V., Carter, M. E., Palmer, M., Ginter, T., Redd, A., Pickard, S., Shen, S., South, B., Divita, G., DuVall, S., Nguyen, T. M., D'Avolio, L. W., Samore, M. |
Language: |
English |
Publication year: |
2013 |
Subject: |
|
Source: |
Scopus-Elsevier |
Description: |
Information retrieval algorithms based on natural language processing (NLP) of the free text of medical records have been used to find documents of interest in databases. Homelessness is a high-priority non-medical diagnosis that is noted in the electronic medical records of Veterans in Veterans Affairs (VA) facilities. Using a human-reviewed reference standard corpus of clinical documents of Veterans with and without evidence of homelessness, an open-source NLP tool (Automated Retrieval Console v2.0, ARC) was trained to classify documents. The best-performing model, based on a document-level workflow, performed well on a test set (Precision 94%, Recall 97%, F-Measure 96%). Processing a naïve set of 10,000 randomly selected VA documents with this model yielded 463 documents flagged as positive, indicating a 4.7% prevalence of homelessness. Human review found a precision of 70% for these flags, resulting in an adjusted prevalence of homelessness of 3.3%, which matches current VA estimates. Further refinements are underway to improve performance. We demonstrate an effective and rapid lifecycle for using an off-the-shelf NLP tool to screen medical records for targets of interest. |
Database: |
OpenAIRE |
External link: |
|