Development and validation of deep learning and BERT models for classification of lung cancer radiology reports

Authors: S. Mithun, Ashish Kumar Jha, Umesh B. Sherkhane, Vinay Jaiswar, Nilendu C. Purandare, V. Rangarajan, A. Dekker, Sander Puts, Inigo Bermejo, L. Wee
Language: English
Year of publication: 2023
Subject:
Source: Informatics in Medicine Unlocked, Vol 40, Iss , Pp 101294- (2023)
Document type: article
ISSN: 2352-9148
DOI: 10.1016/j.imu.2023.101294
Description: Purpose: Manual cohort building from radiology reports can be tedious. Natural Language Processing (NLP) can be used to automate cohort building. In this study, we developed and validated an NLP approach based on deep learning (DL) to select lung cancer reports from a thoracic disease management group cohort. Materials and methods: 4064 radiology reports (CT and PET/CT) of a thoracic disease management group, reported between 2014 and 2016, were used. These reports were anonymised, cleaned, text-normalized, and split into training, testing, and validation sets. External validation was performed on radiology reports from the MIMIC-III clinical database. We used three DL models, namely Bi-LSTM_simple, Bi-LSTM_dropout, and a pre-trained BERT model, to predict whether a report concerned lung cancer. We studied the effect of minority oversampling on all models. Results: On internal validation, the F1 scores at 95% CI for Bi-LSTM_simple, Bi-LSTM_dropout, and BERT were 0.89, 0.90, and 0.86 without oversampling and 0.94, 0.94, and 0.9 with oversampling. On external validation, the F1 scores of the Bi-LSTM_simple, Bi-LSTM_dropout, and BERT models were 0.63, 0.77, and 0.80 without oversampling and 0.72, 0.78, and 0.77 with oversampling. Conclusion: The pre-trained BERT and Bi-LSTM_dropout models for predicting whether a report concerns lung cancer showed consistent performance on internal and external validation, with the BERT model exhibiting superior performance. The overall F1 score decreased on external validation for both Bi-LSTM models, with the Bi-LSTM_simple model showing a more significant drop. All models showed some improvement with minority oversampling.
Database: Directory of Open Access Journals
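
Illustrative note: the abstract mentions minority oversampling and F1 scores without stating how either was implemented. The following is a minimal sketch, assuming simple random duplication of minority-class reports and a binary F1 computed from true positives, false positives, and false negatives. The function names `oversample_minority` and `f1_score` and the toy data are hypothetical and are not taken from the article; the authors may have used a different oversampling technique.

```python
import random
from collections import Counter


def oversample_minority(texts, labels, seed=0):
    """Randomly duplicate examples of smaller classes until every class
    has as many examples as the largest one (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        # Pool of original examples belonging to this class.
        pool = [t for t, y in zip(texts, labels) if y == cls]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): label 1 = lung cancer report, 0 = other.
train_texts = ["mass in right upper lobe", "no focal lesion",
               "clear lung fields", "normal study", "no abnormality"]
train_labels = [1, 0, 0, 0, 0]

balanced_texts, balanced_labels = oversample_minority(train_texts, train_labels)
```

In a sketch like this, oversampling would normally be applied to the training split only, so that duplicated reports do not leak into the test or validation data used to compute F1.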