Popis: |
Ensuring the security and safety of passengers and cargo through effective baggage screening is a critical challenge in high-traffic environments such as airports, where traditional manual processes suffer from high error rates, fatigue among security personnel, and privacy concerns. These issues underscore the urgent need for automated systems capable of accurately detecting concealed threats. Advances in deep learning have introduced more refined approaches to baggage screening, yet these approaches struggle to capture the global context of threats and to cope with clutter, concealment, and occlusion. To address these issues, this paper presents the Multi-Scale Hierarchical Transformer for X-ray Detection (MHT-X), a novel framework that integrates a multi-scale contour mapping technique with a hierarchical feature extraction process based on visual Transformers. MHT-X generates multi-level contour maps that reveal complex features of threat objects, facilitating their accurate identification. It processes hierarchical features in a top-down manner through an efficient encoder-decoder mechanism, capturing the spatial characteristics needed for precise threat localization. By combining multi-scale contour maps with hierarchical feature extraction, MHT-X improves the detection of concealed and occluded threat objects in X-ray baggage scans and outperforms traditional Convolutional Neural Network (CNN) models by capturing the global context of illegal items more effectively. The proposed framework is validated on two public datasets, CLCXray and PIDray, where it outperforms state-of-the-art (SOTA) methods, achieving mean average precision scores of 65.26% and 78.19%, respectively. |