Description: |
The record of what occurred during a surgical procedure is typically represented in the electronic health record as unstructured text blocks (the operative report) accompanied by limited structured data. Billing codes fail to capture significant variation among procedures; thus, although much of this information is valuable for real-time patient safety interventions, it is rarely available for automated analysis. Selecting an appropriate ontological model provides a sound foundation for effective information extraction and knowledge representation, enabling high-quality inference and knowledge-based concept identification. Through gap analysis and statistical analysis of the content of a corpus of operative notes, SNOMED CT was selected as the most appropriate knowledge model for automated information extraction in this domain. To successfully apply statistical natural language processing (NLP) methods developed on one corpus to another type of text, one must assume that the texts are sufficiently similar, both syntactically and semantically. On this basis, the applicability of existing clinical NLP tools to the operative report was assessed. General clinical text was found not to be representative of the writing observed in operative reports. Building on this theoretical foundation, text classifiers were developed to demonstrate the feasibility of automatically encoding a subset of SNOMED CT terms in operative reports. Classification performance was high for detection of surgical specialty and of open versus closed procedures (F-scores of 0.965 and 0.931, respectively); however, laterality was detected more reliably through heuristic methods. |