Author: |
Yucheng Zhang, Benjamin M.M. Grant, Andrew J. Hope, Rayjean J. Hung, Matthew T. Warkentin, Andrew C.L. Lam, Reenika Aggawal, Maria Xu, Frances A. Shepherd, Ming-Sound Tsao, Wei Xu, Mini Pakkal, Geoffrey Liu, Micheal C. McInnis |
Publication year: |
2023 |
Subject: |
|
Source: |
JCO Clinical Cancer Informatics. |
ISSN: |
2473-4276 |
DOI: |
10.1200/cci.22.00153 |
Description: |
PURPOSE Lung cancer screening programs generate a high volume of low-dose computed tomography (LDCT) reports that contain valuable information, typically in a free-text format. High-performance named-entity recognition (NER) models can extract relevant information from these reports automatically for inter-radiologist quality control. METHODS Using LDCT report data from a longitudinal lung cancer screening program (8,305 reports; 3,124 participants; 2006-2019), we trained a rule-based model and two bidirectional long short-term memory (Bi-LSTM) NER neural network models to detect clinically relevant information from LDCT reports. Model performance was tested using F1 scores and compared with a published open-source radiology NER model (Stanza) in an independent evaluation set of 150 reports. The top performing model was applied to a data set of 6,948 reports for an inter-radiologist quality control assessment. RESULTS The best performing model, a Bi-LSTM NER recurrent neural network model, had an overall F1 score of 0.950, which outperformed Stanza (F1 score = 0.872) and a rule-based NER model (F1 score = 0.809). Recall (sensitivity) for the best Bi-LSTM model ranged from 0.916 to 0.991 for different entity types; precision (positive predictive value) ranged from 0.892 to 0.997. Test performance remained stable across time periods. On average, there was a 2.86-fold difference in the number of identified entities between the most and the least detailed radiologists. CONCLUSION We built an open-source Bi-LSTM NER model that outperformed other open-source or rule-based radiology NER models. This model can efficiently extract clinically relevant information from lung cancer screening computed tomography reports with high accuracy, enabling efficient audit and feedback to improve quality of patient care. |
Database: |
OpenAIRE |
External link: |
|
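
The abstract reports precision, recall, and F1 scores for the NER models. As a rough illustration of how these three metrics relate, the sketch below scores a set of predicted entities against a gold-standard set. It assumes exact-match scoring on (start, end, label) triples; the abstract does not state the matching criterion the authors used, and the entity labels and offsets here are hypothetical.

```python
from typing import Set, Tuple

# An entity is (start_offset, end_offset, label). Labels and offsets are hypothetical.
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Score predictions against gold entities using exact matching (an assumption)."""
    true_positives = len(gold & predicted)
    # Precision (positive predictive value): share of predicted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall (sensitivity): share of gold entities that were found.
    recall = true_positives / len(gold) if gold else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up annotations for a single report.
gold = {(0, 6, "NODULE"), (10, 14, "SIZE"), (20, 30, "LOCATION"), (35, 40, "NODULE")}
predicted = {(0, 6, "NODULE"), (10, 14, "SIZE"), (20, 30, "NODULE")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

In this toy case two of three predictions are correct and two of four gold entities are found, so a span with the right boundaries but the wrong label counts as both a false positive and a false negative.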