CANAL -- Cyber Activity News Alerting Language Model: Empirical Approach vs. Expensive LLM

Author:	Patel, Urjitkumar; Yeh, Fang-Chun; Gondhalekar, Chinmay
Year of publication:	2024
Subject:	Computer Science - Cryptography and Security; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; 68T50, 68T07 (Primary); 03B65, 91F20 (Secondary); I.2.7; I.2.1; I.5.1; I.5.2; I.5.4; H.3.3
Source:	2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), Houston, TX, USA, 2024, pp. 1-12
Document type:	Working Paper
DOI:	10.1109/ICAIC60265.2024.10433839
Description:	In today's digital landscape, where cyber attacks have become the norm, detecting cyber attacks and threats is critical across diverse domains. Our research presents a new empirical framework for cyber threat modeling, adept at parsing and categorizing cyber-related information from news articles, enhancing real-time vigilance for market stakeholders. At the core of this framework is a fine-tuned BERT model, which we call CANAL - Cyber Activity News Alerting Language Model, tailored for cyber categorization using a novel silver labeling approach powered by Random Forest. We benchmark CANAL against larger, costlier LLMs, including GPT-4, LLaMA, and Zephyr, highlighting their zero-shot to few-shot learning in cyber news classification. CANAL outperforms all of these LLM counterparts in both accuracy and cost-effectiveness. Furthermore, we introduce the Cyber Signal Discovery module, a strategic component designed to efficiently detect emerging cyber signals from news articles. Collectively, CANAL and the Cyber Signal Discovery module equip our framework to provide a robust and cost-effective solution for businesses that require agile responses to cyber intelligence. Comment: Published in 2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), Conference Date: 07-09 February 2024
Database:	arXiv
External link:	http://arxiv.org/abs/2405.06772