Building a common pipeline for rule-based document classification

Author: Olga V. Patterson, Thomas Ginter, Scott L. DuVall
Year of publication: 2013
Subject:
Source: Studies in health technology and informatics. 192
ISSN: 1879-8365
Description: Instance-based classification of clinical text is a widely used natural language processing task employed as a step for patient classification, document retrieval, or information extraction. Rule-based approaches rely on concept identification and context analysis to determine the appropriate class. We propose a five-step process that enables even small research teams to develop simple but powerful rule-based NLP systems by taking advantage of a common UIMA AS-based pipeline for classification. Our proposed methodology, coupled with the general-purpose solution, gives researchers access to the data locked in clinical text when human resources are limited and timelines are compact.
Database: OpenAIRE