Performance Evaluation of Different Object Detection Models for the Segmentation of Optical Cups and Discs

Autor:	Gendry Alfonso-Francia, Jesus Carlos Pedraza-Ortega, Mariana Badillo-Fernández, Manuel Toledano-Ayala, Marco Antonio Aceves-Fernandez, Juvenal Rodriguez-Resendiz, Seok-Bum Ko, Saul Tovar-Arriaga
Jazyk:	angličtina
Rok vydání:	2022
Předmět:	glaucoma segmentation object detection Mask R-CNN Cascade Mask R-CNN instance segmentation Medicine (General) R5-920
Zdroj:	Diagnostics, Vol 12, Iss 12, p 3031 (2022)
Druh dokumentu:	article
ISSN:	2075-4418
DOI:	10.3390/diagnostics12123031
Popis:	Glaucoma is an eye disease that gradually deteriorates vision. Much research focuses on extracting information from the optic disc and optic cup, the structure used for measuring the cup-to-disc ratio. These structures are commonly segmented with deeplearning techniques, primarily using Encoder–Decoder models, which are hard to train and time-consuming. Object detection models using convolutional neural networks can extract features from fundus retinal images with good precision. However, the superiority of one model over another for a specific task is still being determined. The main goal of our approach is to compare object detection model performance to automate segment cups and discs on fundus images. This study brings the novelty of seeing the behavior of different object detection models in the detection and segmentation of the disc and the optical cup (Mask R-CNN, MS R-CNN, CARAFE, Cascade Mask R-CNN, GCNet, SOLO, Point_Rend), evaluated on Retinal Fundus Images for Glaucoma Analysis (REFUGE), and G1020 datasets. Reported metrics were Average Precision (AP), F1-score, IoU, and AUCPR. Several models achieved the highest AP with a perfect 1.000 when the threshold for IoU was set up at 0.50 on REFUGE, and the lowest was Cascade Mask R-CNN with an AP of 0.997. On the G1020 dataset, the best model was Point_Rend with an AP of 0.956, and the worst was SOLO with 0.906. It was concluded that the methods reviewed achieved excellent performance with high precision and recall values, showing efficiency and effectiveness. The problem of how many images are needed was addressed with an initial value of 100, with excellent results. Data augmentation, multi-scale handling, and anchor box size brought improvements. The capability to translate knowledge from one database to another shows promising results too.
Databáze:	Directory of Open Access Journals
Externí odkaz:	https://doaj.org/article/605d7b433fb441f4926ffc7485a8ae52 Zobrazit plný text záznamu View record in DOAJ Plný text ve formátu PDF Plný text ve formátu HTML
Nepřihlášeným uživatelům se plný text nezobrazuje	K zobrazení výsledku je třeba se přihlásit.