A Novel Transformer Model With Multiple Instance Learning for Diabetic Retinopathy Classification

Authors: Yaoming Yang, Zhili Cai, Shuxia Qiu, Peng Xu
Language: English
Year of publication: 2024
Source: IEEE Access, Vol. 12, pp. 6768-6776 (2024)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3351473
Description: Diabetic retinopathy (DR) is an irreversible fundus retinopathy. A deep learning-based automated DR diagnosis system can save diagnostic time. While the Transformer has shown superior performance compared to the Convolutional Neural Network (CNN), it typically requires pre-training with large amounts of data. Although Transformer-based DR diagnosis methods may alleviate the problem of limited performance on small-scale retinal datasets by loading pre-trained weights, the size of input images is restricted to $224\times 224$. The resolution of retinal images captured by fundus cameras is much higher than $224\times 224$, and reducing the resolution for training results in the loss of valuable information. To efficiently utilize high-resolution retinal images, a new Transformer model with multiple instance learning (TMIL) is proposed for DR classification. A multiple instance learning approach is first applied to the retinal images to segment these high-resolution images into $224\times 224$ image patches. Subsequently, a Vision Transformer (ViT) is used to extract features from each patch. Then, a Global Instance Computing Block (GICB) is designed to calculate the inter-instance features. After introducing global information from the GICB, the features are used to output the classification results. When using high-resolution retinal images, TMIL can load pre-trained Transformer weights without model performance being affected by weight interpolation. Experimental results on the APTOS dataset and the Messidor-1 dataset demonstrate that TMIL achieves better classification performance and reduces inference time by 62% compared with directly inputting high-resolution images into ViT. Moreover, TMIL achieves the highest classification accuracy compared with current state-of-the-art results. The code will be publicly available at https://github.com/CNMaxYang/TMIL.
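The instance-extraction step described in the abstract — segmenting a high-resolution retinal image into a bag of $224\times 224$ patches for the ViT backbone — can be sketched as follows. This is a minimal illustration under assumed conventions (NumPy arrays, non-overlapping patches, image dimensions divisible by 224), not the authors' implementation; the function name `extract_instances` is hypothetical.

```python
import numpy as np

PATCH = 224  # ViT input size, as stated in the abstract

def extract_instances(image: np.ndarray, patch: int = PATCH) -> np.ndarray:
    """Split an H x W x C image into a bag of non-overlapping
    patch x patch instances (multiple-instance-learning style).
    Assumes H and W are multiples of `patch`."""
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    # Reshape into a (rows, cols) grid of patches, then flatten
    # the grid into a single bag of instances.
    grid = image[:rows * patch, :cols * patch].reshape(
        rows, patch, cols, patch, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(rows * cols, patch, patch, c)

# Example: an 896 x 896 RGB fundus image yields a bag of 16 instances
bag = extract_instances(np.zeros((896, 896, 3), dtype=np.uint8))
print(bag.shape)  # (16, 224, 224, 3)
```

Each instance in the bag would then be fed through the shared ViT encoder, after which the GICB aggregates the per-instance features into a global representation for classification.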
Database: Directory of Open Access Journals