DisTGranD: Granular event/sub-event classification for disaster response

Authors: Adesokan, Ademola; Madria, Sanjay; Nguyen, Long
Source: Online Social Networks and Media; 20240101, Issue: Preprints
Abstract: Efficient crisis management relies on prompt and precise analysis of disaster data from various sources, including social media. Fine-grained, annotated, class-labeled data provides a more diverse range of information than datasets with only high-level labels. In this study, we introduce a dataset richly annotated at a low level to classify crisis-related communication more accurately. To this end, we first present DisTGranD, an extensively annotated dataset of over 47,600 tweets related to earthquakes and hurricanes. The dataset follows the Automatic Content Extraction (ACE) standard to provide dual-layer annotation of events and sub-events and to identify critical triggers and supporting arguments. Inter-annotator evaluation of DisTGranD showed high agreement, with Fleiss' kappa scores of 0.90 and 0.93 for event and sub-event types, respectively. Moreover, with a transformer-based embedded phrase extraction method, XLNet achieved an intra-label similarity score of 96% for event type and 97% for sub-event type. We further propose a novel deep learning classification model, RoBiCCus, which achieved ≥90% accuracy and F1-score in the event and sub-event type classification tasks on our DisTGranD dataset and outperformed other models on publicly available disaster datasets. The DisTGranD dataset represents a nuanced class-labeled framework for detecting and classifying disaster-related social media content, which can significantly aid decision-making in disaster response. This robust dataset enables deep-learning models to provide insightful, actionable data during crises. Our annotated dataset and code are publicly available on GitHub.
Database: Supplemental Index