Description: |
In the era of advanced computer vision and natural language processing, social media has become an even more valuable source of information for directing aid and rescuing victims. Millions of texts and images can now be processed in real time, allowing emergency responders to assess evolving crises efficiently and allocate resources appropriately. Most previous detection studies are text-only or image-only, overlooking the potential benefits of integrating both modalities. In this paper, we propose a Multimodal Channel Attention (MCA) block, which employs an adaptive attention mechanism that learns to assign varying importance to each modality. We then propose a novel Deep Multimodal Crisis Categorization (DMCC) framework, which employs a two-level fusion strategy for better integration of textual and visual information. The DMCC framework combines feature-level fusion, accomplished through the MCA block, with score-level fusion, in which the decisions made by the individual modalities are integrated with those of the MCA model. Extensive experiments on publicly available datasets demonstrate the effectiveness of the proposed framework: it outperforms unimodal methods and surpasses the current state-of-the-art methods on crisis-related categorization tasks. The code is available at https://github.com/MarihamR/Categorizing-Crises-from-Social-Media-Feeds-Via-Multimodal-Channel-Attention.
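
To make the two-level fusion concrete, below is a minimal sketch in PyTorch. The class name `MCABlock`, the squeeze-and-excitation-style gating, the layer sizes, and the plain softmax averaging used for score-level fusion are all illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Hedged sketch of the described two-level fusion, assuming PyTorch.
# All names, dimensions, and the score-averaging scheme are illustrative.
import torch
import torch.nn as nn


class MCABlock(nn.Module):
    """Hypothetical Multimodal Channel Attention block: a learned
    channel gate assigns varying importance to each modality's
    features before classification (feature-level fusion)."""

    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        # Squeeze-and-excitation-style gate over the concatenated features.
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim // 4),
            nn.ReLU(inplace=True),
            nn.Linear(dim // 4, 2 * dim),
            nn.Sigmoid(),
        )
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, text_feat: torch.Tensor, img_feat: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([text_feat, img_feat], dim=-1)  # feature-level fusion
        attended = fused * self.gate(fused)               # adaptive per-channel weighting
        return self.classifier(attended)                  # fused-branch logits


def score_level_fusion(text_logits, image_logits, mca_logits):
    """Integrate the unimodal decisions with those of the MCA branch;
    plain averaging of softmax scores stands in for whatever weighting
    the paper actually applies."""
    probs = [torch.softmax(l, dim=-1) for l in (text_logits, image_logits, mca_logits)]
    return torch.stack(probs).mean(dim=0)


if __name__ == "__main__":
    dim, num_classes, batch = 512, 5, 4
    text_feat = torch.randn(batch, dim)  # e.g. from a text encoder
    img_feat = torch.randn(batch, dim)   # e.g. from an image encoder
    mca_logits = MCABlock(dim, num_classes)(text_feat, img_feat)
    # Stand-in unimodal classification heads, just for the demo.
    text_logits = nn.Linear(dim, num_classes)(text_feat)
    image_logits = nn.Linear(dim, num_classes)(img_feat)
    final_scores = score_level_fusion(text_logits, image_logits, mca_logits)
    print(final_scores.shape)  # torch.Size([4, 5])
```

The design choice being illustrated is that the two fusion levels are complementary: the gate inside `MCABlock` mixes modalities at the feature level, while `score_level_fusion` keeps the unimodal classifiers in the loop so a strong single-modality signal is not washed out by the fused branch.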