Description: |
Background: Machine-assisted recognition of colorectal cancer has mainly relied on supervised deep learning, which suffers from a significant bottleneck: it requires massive amounts of labeled data. We hypothesize that semi-supervised deep learning, which leverages a small amount of labeled data, can provide a powerful alternative strategy. Method: We proposed a semi-supervised model based on the mean teacher approach that provides pathological predictions at both the patch level and the patient level. We demonstrated the general utility of the model using 13,111 whole slide images from 8,803 subjects gathered from 13 centers. We compared our proposed method with the prevailing supervised learning approach and with six pathologists. Results: With a small number of labeled training patches (~3,150 labeled and ~40,950 unlabeled, or ~6,300 labeled and ~37,800 unlabeled), the semi-supervised model performed significantly better than the supervised model (AUC: 0.90 ± 0.06 vs. 0.84 ± 0.07, P value = 0.02, or AUC: 0.98 ± 0.01 vs. 0.92 ± 0.04, P value = 0.0004, respectively). Moreover, we found no significant difference between the supervised model using a massive set of ~44,100 labeled patches and the semi-supervised model (~6,300 labeled, ~37,800 unlabeled) in patch-level diagnoses (AUC: 0.98 ± 0.01 vs. 0.987 ± 0.01, P value = 0.134) or in patient-level diagnoses (average AUC: 97.40% vs. 97.96%, P value = 0.117). Our model performed close to human pathologists (average AUC: 97.17% vs. 96.91%). Conclusions: We reported, through a multi-center study, that semi-supervised learning can achieve excellent performance. We thus suggest that semi-supervised learning has great potential for building artificial intelligence (AI) platforms, which would dramatically reduce the cost of labeled data and greatly facilitate the development and application of AI in medical sciences. |