Inclusion and Exclusion of Medical Codes for Primary Care Data Extraction
Authors: | Shao Fen Liang, Vasa Curcin, Xiaohui Sun, Martin Gulliford |
---|---|
Year of publication: | 2018 |
Subject: | Information retrieval; Medical terminology; Medical informatics; Computer science; Engineering and technology; Medical classification; Substring; Terminology; Medical and health sciences; Clinical medicine; Data extraction; Read codes; Electrical engineering, electronic engineering, information engineering; False positive paradox; Task analysis; General & internal medicine |
Source: | ICHI |
DOI: | 10.1109/ichi.2018.00070 |
Description: | In UK primary care research, defining a full set of medical codes for data analytics is laborious and time-intensive because of the size and variety of the coding systems in use. Our work aims to facilitate this process by developing a prototype that uses a semi-automatic extraction approach to identify the required codes within a specific medical terminology. We used Natural Language Processing techniques for tokenisation and substring extraction, together with expert interaction to filter out unwanted medical concepts. Our approach was tested on a clinical study extracting data on pneumonia of bacterial origin. The system generated 147 Read clinical concepts, whereas the manual process generated 103. Of the 44 additional concepts generated by the system, 28 had been missed by the analyst and 16 were false positives that needed to be excluded. |
Database: | OpenAIRE |
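
The description mentions tokenisation and substring extraction followed by expert filtering. As a rough illustration only, the Python sketch below shows one way such an include-then-exclude pass might look. It is not the authors' prototype: the codes, descriptions, function names, and search terms are all hypothetical, and the expert's interactive filtering is reduced to a fixed list of exclusion terms.

```python
import re

# Toy terminology for illustration: these codes and descriptions are
# hypothetical placeholders, not real Read codes.
TERMINOLOGY = {
    "X001": "Pneumococcal pneumonia",
    "X002": "Bacterial pneumonia NOS",
    "X003": "Viral pneumonia",
    "X004": "Pneumonia vaccination given",
    "X005": "Acute bronchitis",
}


def tokenise(text):
    """Lower-case a description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_any(description, terms):
    """Return True if any term occurs as a substring of any token."""
    tokens = tokenise(description)
    return any(term.lower() in token for term in terms for token in tokens)


def candidate_codes(terminology, include_terms):
    """Inclusion step: keep every concept matching an inclusion term."""
    return {
        code: description
        for code, description in terminology.items()
        if matches_any(description, include_terms)
    }


def apply_exclusions(candidates, exclude_terms):
    """Exclusion step: drop candidates matching a term the expert rejected."""
    return {
        code: description
        for code, description in candidates.items()
        if not matches_any(description, exclude_terms)
    }


# Assumed search terms; in practice an expert would choose them interactively.
candidates = candidate_codes(TERMINOLOGY, ["pneum"])
selected = apply_exclusions(candidates, ["viral", "vaccin"])

for code, description in sorted(selected.items()):
    print(code, description)
```

Substring matching on tokens casts a deliberately wide net, which is why an exclusion step is needed: in this toy data the stem "pneum" also pulls in the viral and vaccination concepts, which the exclusion terms then remove.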