Description: |
Source Device Identification (SDI) is a central task in multimedia forensics: it aims to identify the device that captured a given image or video. This paper introduces an SDI method that uses log-Mel spectrograms extracted from the audio track of videos. The classifier is a ResNet-based model optimized with Neural Architecture Search and combined with Gradient-weighted Class Activation Mapping (Grad-CAM), which provides insight into the most influential spectrogram regions. A strong emphasis on high-frequency components of the audio is observed, which motivates band-pass filtering the input spectrograms to retain only high-frequency information; this yields the highest camera-model classification accuracy. Experiments on the VISION dataset, which comprises data from 35 different devices, demonstrate that the proposed method achieves accurate and interpretable SDI, and mark the first application of explainable Artificial Intelligence (xAI) techniques based on Grad-CAM in this context. Furthermore, a bootstrap analysis evaluates the classification performance of the proposed methodology with and without the integration of Grad-CAM explanations. A comparison of the Grad-CAM-driven method, which uses band-pass filtered log-Mel spectrograms, against state-of-the-art approaches illustrates its high SDI accuracy. |
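
The band-pass step on log-Mel spectrograms described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the HTK-style mel formula, the 44.1 kHz sampling rate (Nyquist 22050 Hz), the 8 kHz lower cutoff, and the -80 dB floor are all assumptions chosen for illustration, and the spectrogram is toy data rather than real audio.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel (HTK-style formula, an assumed convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_centers(n_mels, f_min, f_max):
    """Center frequencies (Hz) of n_mels bands evenly spaced on the mel scale."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_mels + 1)
    return [mel_to_hz(m_lo + step * (i + 1)) for i in range(n_mels)]


def band_pass_log_mel(log_mel, centers_hz, low_hz, high_hz, floor_db=-80.0):
    """Keep mel bands whose center lies in [low_hz, high_hz].

    log_mel is a list of rows, one row per mel band (lowest band first).
    Bands outside the pass band are replaced by a constant floor value.
    """
    return [
        list(row) if low_hz <= c <= high_hz else [floor_db] * len(row)
        for row, c in zip(log_mel, centers_hz)
    ]


# Toy log-Mel spectrogram: 8 mel bands x 3 time frames (hypothetical values).
centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=22050.0)
toy = [[-float(b + 1)] * 3 for b in range(8)]

# Retain only the high-frequency bands (assumed cutoff of 8 kHz).
filtered = band_pass_log_mel(toy, centers, low_hz=8000.0, high_hz=22050.0)
```

In a real pipeline the filtered spectrogram would be the classifier input; here the low-frequency rows are simply flattened to the floor value so that only high-frequency content can influence a downstream model.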