Optimal number of strong labels for curriculum learning with convolutional neural network to classify pulmonary abnormalities in chest radiographs
Authors: | Namkug Kim, Joon Beom Seo, Beomhee Park, Kyung Hee Lee, Sang Min Lee, Yongwon Cho |
---|---|
Year of publication: | 2021 |
Subject: |
Lung Diseases; Pleural effusion; Radiography; Deep learning; External validation; Health Informatics; Convolutional neural network; Computer Science Applications; Pneumothorax; Humans; Learning; Artificial intelligence; Radiology; Curriculum; Neural Networks, Computer; Chest radiograph |
Source: | Computers in Biology and Medicine. 136 |
ISSN: | 1879-0534 |
Description: | Background and objective: It is important to reduce annotation effort and cost by training efficiently on medical images. We performed a stress test on the number of strong labels used in curriculum learning with a convolutional neural network to differentiate normal findings from five types of pulmonary abnormalities in chest radiographs (CXRs). Methods: The numbers of CXR images of healthy subjects and patients, acquired at Asan Medical Center (AMC), were 6069 and 3465, respectively. The numbers of CXR images of patients with nodules, consolidation, interstitial opacity, pleural effusion, and pneumothorax were 944, 550, 280, 1360, and 331, respectively. The AMC dataset was split into training, tuning, and test sets at a ratio of 7:1:2. All lesions were strongly labeled by expert thoracic radiologists, with confirmation on the corresponding CT. For curriculum learning, normal and abnormal patches (N = 26658) were randomly extracted around the normal lung and around the strongly labeled abnormal lesions, respectively. In addition, 1%, 5%, 20%, 50%, and 100% of the strong labels were used to determine the optimal number of strong labels. A ResNet-50 model was trained on each patch dataset, and each model was then fine-tuned on all CXRs with weak labels in a transfer-learning manner. A dataset acquired from Seoul National University Bundang Hospital (SNUBH) was used for external validation. Results: The detection accuracies of the 1%, 5%, 20%, 50%, and 100% datasets were 90.51, 92.15, 93.90, 94.54, and 95.39, respectively, in the AMC dataset and 90.01, 90.14, 90.97, 91.92, and 93.00, respectively, in the SNUBH dataset. Conclusions: Our results showed that curriculum learning with a sampling rate of over 20% for strong labels is sufficient to train a model with relatively high performance, which can be developed easily and efficiently in an actual clinical setting. |
Database: | OpenAIRE |
External link: |
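
The 7:1:2 split and the strong-label sampling rates given in the description can be illustrated with a minimal Python sketch. The record does not state how the authors shuffled, stratified, or sampled the data, so the helper names `split_indices` and `subsample`, the fixed seed, the plain random (non-stratified) shuffle, and the choice to apply the sampling rate to all 26658 patches are assumptions made only for illustration.

```python
import random


def split_indices(n, ratios=(7, 1, 2), seed=0):
    """Shuffle range(n) and cut it into three parts proportional to `ratios`.

    Hypothetical helper: the source only states the 7:1:2 ratio.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    total = sum(ratios)
    n_train = n * ratios[0] // total
    n_tune = n * ratios[1] // total
    # Whatever remains after the first two cuts goes to the test set.
    return idx[:n_train], idx[n_train:n_train + n_tune], idx[n_train + n_tune:]


def subsample(items, fraction, seed=0):
    """Keep a random `fraction` of `items` (at least one item).

    Hypothetical helper: the sampling procedure is not described in the source.
    """
    rng = random.Random(seed)
    k = max(1, round(len(items) * fraction))
    return rng.sample(items, k)


# Image counts reported in the description: healthy subjects + patients.
n_images = 6069 + 3465
train, tune, test = split_indices(n_images)

# Stand-in IDs for the 26658 strongly labeled patches (assumption: the
# sampling rate is applied to the full patch pool).
patch_ids = list(range(26658))
rates = (0.01, 0.05, 0.20, 0.50, 1.00)
patch_counts = {rate: len(subsample(patch_ids, rate)) for rate in rates}

print(len(train), len(tune), len(test))
print(patch_counts)
```

Under these assumptions, the sketch only shows how many images fall into each split and how many patches each sampling rate would retain; it does not reproduce the authors' training pipeline.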