Inclusion and Exclusion of Medical Codes for Primary Care Data Extraction

Authors: Shao Fen Liang, Vasa Curcin, Xiaohui Sun, Martin Gulliford
Year of publication: 2018
Subject:
Source: ICHI
DOI: 10.1109/ichi.2018.00070
Description: In UK primary care research, defining a full set of medical codes for data analytics is a laborious and time-intensive task, owing to the size and variety of the coding systems in use in the UK. Our work aims to facilitate this process by developing a prototype that uses a semi-automatic extraction approach to identify the required codes within a specific medical terminology. We used Natural Language Processing techniques for tokenisation and substring extraction, together with expert interaction to filter out unwanted medical concepts. Our approach has been tested on a clinical study that aimed to extract data on pneumonia of bacterial origin. Our system generated 147 READ clinical concepts, while a manual process generated 103 concepts. Of the additional 44 concepts generated by the system, 28 had been missed by the analyst and 16 were false positives that needed to be excluded.
Database: OpenAIRE
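
The description mentions tokenisation and substring extraction followed by expert filtering. A minimal sketch of that general idea is given below; it is not the authors' prototype. The terminology entries, the codes (X001 to X005), and the function names are hypothetical placeholders invented for illustration and are not real READ codes.

```python
import re

# Toy terminology. Codes and descriptions are invented placeholders,
# NOT real READ codes and NOT data from the paper.
TERMINOLOGY = {
    "X001": "Pneumococcal pneumonia",
    "X002": "Viral pneumonia",
    "X003": "Staphylococcal pneumonia",
    "X004": "Asthma",
    "X005": "Bronchopneumonia due to streptococcus",
}


def tokenise(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_any(description, terms):
    """True if any token of any term is a substring of any description token."""
    needles = [tok for term in terms for tok in tokenise(term)]
    return any(n in tok for n in needles for tok in tokenise(description))


def candidate_codes(terminology, search_terms):
    """Inclusion step: keep concepts whose description matches a search term."""
    return {code: desc for code, desc in terminology.items()
            if matches_any(desc, search_terms)}


def apply_exclusions(candidates, exclusion_terms):
    """Exclusion step: drop concepts an expert has flagged via exclusion terms."""
    return {code: desc for code, desc in candidates.items()
            if not matches_any(desc, exclusion_terms)}


# Inclusion by substring: "pneumonia" also matches "bronchopneumonia".
candidates = candidate_codes(TERMINOLOGY, ["pneumonia"])
# Exclusion supplied by a (hypothetical) expert reviewing the candidates.
selected = apply_exclusions(candidates, ["viral"])
```

Substring matching on tokens is what lets a single search term pick up compound words, at the cost of some false positives; the exclusion step stands in for the expert interaction that removes them.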