Using Recurrent Neural Networks to Extract High-Quality Information From Lung Cancer Screening Computerized Tomography Reports for Inter-Radiologist Audit and Feedback Quality Improvement.
Author: | Zhang Y; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada., Grant BMM; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada., Hope AJ; Radiation Medicine Program, Princess Margaret Cancer Centre, and Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada., Hung RJ; Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health Systems, Toronto, ON, Canada.; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada., Warkentin MT; Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health Systems, Toronto, ON, Canada.; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada., Lam ACL; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada., Aggawal R; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada., Xu M; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada., Shepherd FA; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada., Tsao MS; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Laboratory Medicine and Pathology, University Health Network, Toronto, ON, Canada., Xu W; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.; Biostatistics, Princess Margaret Cancer Centre, Toronto, ON, Canada.; Computational Biology and Medicine Program, Princess Margaret Cancer Centre, Toronto, ON, Canada., Pakkal M; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Division of Cardiothoracic Imaging, Joint Department of Medical Imaging, Toronto General Hospital, Toronto, ON, Canada., Liu G; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, ON, Canada.; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.; Biostatistics, Princess Margaret Cancer Centre, Toronto, ON, Canada.; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada., McInnis MC; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.; Division of Cardiothoracic Imaging, Joint Department of Medical Imaging, Toronto General Hospital, Toronto, ON, Canada. |
---|---|
Language: | English |
Source: | JCO clinical cancer informatics [JCO Clin Cancer Inform] 2023 Mar; Vol. 7, pp. e2200153. |
DOI: | 10.1200/CCI.22.00153 |
Abstract: | Purpose: Lung cancer screening programs generate a high volume of low-dose computed tomography (LDCT) reports that contain valuable information, typically in a free-text format. High-performance named-entity recognition (NER) models can extract relevant information from these reports automatically for inter-radiologist quality control. Methods: Using LDCT report data from a longitudinal lung cancer screening program (8,305 reports; 3,124 participants; 2006-2019), we trained a rule-based model and two bidirectional long short-term memory (Bi-LSTM) NER neural network models to detect clinically relevant information from LDCT reports. Model performance was tested using F1 scores and compared with a published open-source radiology NER model (Stanza) in an independent evaluation set of 150 reports. The top-performing model was applied to a data set of 6,948 reports for an inter-radiologist quality control assessment. Results: The best-performing model, a Bi-LSTM NER recurrent neural network model, had an overall F1 score of 0.950, which outperformed Stanza (F1 score = 0.872) and a rule-based NER model (F1 score = 0.809). Recall (sensitivity) for the best Bi-LSTM model ranged from 0.916 to 0.991 for different entity types; precision (positive predictive value) ranged from 0.892 to 0.997. Test performance remained stable across time periods. On average, there was a 2.86-fold difference in the number of identified entities between the most and the least detailed radiologists. Conclusion: We built an open-source Bi-LSTM NER model that outperformed other open-source or rule-based radiology NER models. This model can efficiently extract clinically relevant information from lung cancer screening computerized tomography reports with high accuracy, enabling efficient audit and feedback to improve quality of patient care. |
Database: | MEDLINE |
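
The abstract reports precision, recall, and F1 scores for the NER models. As a rough illustration of how such scores relate to one another, the sketch below computes them at the entity level. It assumes exact-match scoring (a predicted entity counts only if its span and label both match a reference annotation); the abstract does not state which matching scheme the authors used, and the `Entity` type, the labels, and the offsets are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Entity(NamedTuple):
    """Hypothetical entity annotation: character span plus label."""
    start: int
    end: int
    label: str


def precision_recall_f1(
    gold: Iterable[Entity], pred: Iterable[Entity]
) -> Tuple[float, float, float]:
    """Entity-level scores under exact span-and-label matching (an assumption)."""
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    # Precision (positive predictive value): share of predictions that are correct.
    precision = true_pos / len(pred_set) if pred_set else 0.0
    # Recall (sensitivity): share of reference entities that were found.
    recall = true_pos / len(gold_set) if gold_set else 0.0
    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up annotations for one report; labels and offsets are illustrative only.
gold = [
    Entity(0, 6, "NODULE"),
    Entity(10, 14, "SIZE"),
    Entity(20, 36, "LOCATION"),
    Entity(40, 52, "EMPHYSEMA"),
]
pred = [
    Entity(0, 6, "NODULE"),
    Entity(10, 14, "SIZE"),
    Entity(20, 36, "LOCATION"),
]

precision, recall, f1 = precision_recall_f1(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here every prediction is correct but one reference entity is missed, so precision is 1.0 while recall falls to 0.75. Token-level or partial-overlap matching would yield different numbers on the same annotations.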