Abstract: |
Artificial intelligence (AI) is the discipline focused on enabling computers to operate autonomously without explicit programming. Within AI, computer vision is the field tasked with giving machines the ability to interpret visual data from images and videos. Over recent decades, computer vision has found applications in diverse areas such as autonomous vehicles, information retrieval, surveillance, and the understanding of human behavior. Object detection, a key task in computer vision, relies on deep neural networks, which continue to advance detection accuracy and speed. Its goal is to precisely identify objects within images or videos and assign them to specific classes. Object detection models typically consist of three components: a backbone network for feature extraction, a neck for feature aggregation, and a head for prediction. This study focuses on two-stage detectors: it provides a comprehensive review of these detectors, followed by benchmarking, to offer insights for researchers and scientists. By analyzing the efficacy of these models, this research seeks to guide future developments in object detection within computer vision. [ABSTRACT FROM AUTHOR] |