Neural classification of Norwegian radiology reports: using NLP to detect findings in CT-scans of children

Authors: Lilja Øvrelid, Tore Gundersen, Haldor Husby, Fredrik A. Dahl, Øystein Nytrø, Pål H. Brekke, Petter Hurlen, Taraka Rama
Language: English
Publication year: 2021
Subject:
Adult
Reproducibility of results
medicine.medical_specialty
Computer science
Health Informatics
X-ray computed
Norwegian
computer.software_genre
lcsh:Computer applications to medicine. Medical informatics
Convolutional neural network
030218 nuclear medicine & medical imaging
03 medical and health sciences
0302 clinical medicine
Machine learning
medicine
Humans
Patient group
Child
Tomography
Reliability (statistics)
business.industry
Health Policy
Natural language processing
language.human_language
Computer Science Applications
Data set
Support vector machine
Radiography
030220 oncology & carcinogenesis
language
lcsh:R858-859.7
Artificial intelligence
Radiology
Neural Networks, Computer
business
Tomography, X-Ray Computed
Quality assurance
Recurrent neural network model
computer
Research Article
Source: BMC Medical Informatics and Decision Making, Vol 21, Iss 1, Pp 1-8 (2021)
BMC Medical Informatics and Decision Making
ISSN: 1472-6947
Description: Background: Motivated by quality assurance, machine learning models were trained to classify Norwegian radiology reports of paediatric CT examinations according to whether they describe abnormal findings. Methods: A radiologist classified 13,506 reports from CT scans of children, 1000 reports from CT scans of adults and 1000 reports from X-ray examinations of adults as positive or negative, according to the presence of abnormal findings. Inter-rater reliability was evaluated by comparison with a clinician’s classifications of 500 reports. Test–retest reliability of the radiologist was assessed on the same 500 reports. A convolutional neural network model (CNN), a bidirectional recurrent neural network model (bi-LSTM) and a support vector machine model (SVM) were trained on a random selection of the children’s data set. The models were evaluated on the remaining children’s CT reports and on the adult data sets. Results: Test–retest reliability: Cohen’s Kappa = 0.86 and F1 = 0.919. Inter-rater reliability: Kappa = 0.80 and F1 = 0.885. Model performance on the children’s CT data was as follows. CNN: (AUC = 0.981, F1 = 0.930), bi-LSTM: (AUC = 0.978, F1 = 0.927), SVM: (AUC = 0.975, F1 = 0.912). On the adult data sets, the models had AUC around 0.95 and F1 around 0.91. Conclusions: The models performed close to perfectly on their defined domain, and also performed convincingly on reports pertaining to a different patient group and a different modality. The models were deemed suitable for classifying radiology reports for future quality assurance purposes, where the fraction of examinations with abnormal findings for different sub-groups of patients is a parameter of interest.
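Note: the reliability figures in the description are Cohen's Kappa and F1 computed between two sets of binary labels (positive or negative for abnormal findings). The following is a minimal sketch of how such figures can be computed, not the authors' code. The function names and the ten toy labels are hypothetical and chosen only for illustration; they are not data from the study.

```python
# Illustrative sketch only: agreement metrics between two binary labelings.
# 1 = report describes abnormal findings, 0 = it does not.

def confusion_counts(reference, other):
    """Return (tp, fp, fn, tn), treating `reference` as the ground truth."""
    pairs = list(zip(reference, other))
    tp = sum(1 for r, o in pairs if r == 1 and o == 1)
    fp = sum(1 for r, o in pairs if r == 0 and o == 1)
    fn = sum(1 for r, o in pairs if r == 1 and o == 0)
    tn = sum(1 for r, o in pairs if r == 0 and o == 0)
    return tp, fp, fn, tn


def cohen_kappa(reference, other):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    tp, fp, fn, tn = confusion_counts(reference, other)
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    ref_pos = (tp + fn) / n    # share of positives in `reference`
    other_pos = (tp + fp) / n  # share of positives in `other`
    expected = ref_pos * other_pos + (1 - ref_pos) * (1 - other_pos)
    if expected == 1.0:
        return 1.0  # both labelings are constant and identical
    return (observed - expected) / (1 - expected)


def f1_score(reference, other):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(reference, other)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy labels (not from the study).
first_rating = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
second_rating = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

kappa = cohen_kappa(first_rating, second_rating)
f1 = f1_score(first_rating, second_rating)
print(f"kappa={kappa:.3f}, F1={f1:.3f}")
```

Kappa discounts the agreement that two raters would reach by chance given how often each assigns the positive label, which is why it is lower than the raw agreement rate on the same labels.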
Database: OpenAIRE