From big data to knowledge: A spatio-temporal approach to malware detection
Author: | Weixuan Mao, Yuan Yang, Xiaohong Shi, Xiaohong Guan, Zhongmin Cai |
---|---|
Publication year: | 2018 |
Subject: |
General Computer Science
Computer science; Big data; Cloud computing; Engineering and technology; Software; Security service; Information systems; Scalability; Electrical engineering, electronic engineering, information engineering; Malware; Artificial intelligence & image processing; Timestamp; Executable; Law; Computer network |
Source: | Computers & Security. 74:167-183 |
ISSN: | 0167-4048 |
DOI: | 10.1016/j.cose.2017.12.005 |
Description: | The deployment of endpoint protection has gradually migrated from individual clients to remote cloud servers, a model termed cloud-based security service. This new paradigm of security defense produces a large amount of data and log files and motivates data-driven techniques for detecting malicious software. This paper conducts an empirical study of the log of a real cloud-based security service to characterize the occurrence of executable files on end hosts, covering 124,782 benign and 113,305 malicious executable files that occurred on 165,549,417 end hosts. The end hosts on which an executable file occurs and the timestamps at which it occurs provide insight into the distribution of software in the wild from spatial and temporal perspectives, respectively. We also investigate the strategies behind these characterizations and observe a preferential attachment process and periodicity in file occurrence on end hosts. The different occurrence patterns observed for benign and malicious files inspire a new scalable approach to malware detection. The characterizations show that associated files sharing more spatial and temporal information are more likely to have the same label, either benign or malicious. We therefore devise a graph-based semi-supervised learning algorithm for real-time malware detection that takes into account the spatio-temporal information of the distribution of executable files. Experimental results demonstrate that our approach improves malware detection performance by 14.7% on average over previous techniques. |
Database: | OpenAIRE |
External link: |
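
The description above mentions a graph-based semi-supervised learner that exploits shared spatial and temporal information. The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea using plain label propagation. The function names, the (host, day) input format, the equal weighting of spatial and temporal overlap, and the sample file names are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_labels(occurrences, seeds, iterations=20):
    """Generic label propagation over a file-similarity graph.

    occurrences: file name -> set of (host_id, day) pairs (assumed format).
    seeds: file name -> +1.0 (known malicious) or -1.0 (known benign).
    Returns file name -> score; > 0 leans malicious, < 0 leans benign.
    """
    files = list(occurrences)
    hosts = {f: {h for h, _ in occurrences[f]} for f in files}
    days = {f: {d for _, d in occurrences[f]} for f in files}

    # Edge weight: assumed equal mix of spatial (host) and temporal (day) overlap.
    weights = defaultdict(dict)
    for i, f in enumerate(files):
        for g in files[i + 1:]:
            w = 0.5 * jaccard(hosts[f], hosts[g]) + 0.5 * jaccard(days[f], days[g])
            if w > 0:
                weights[f][g] = w
                weights[g][f] = w

    scores = {f: seeds.get(f, 0.0) for f in files}
    for _ in range(iterations):
        updated = {}
        for f in files:
            if f in seeds:
                updated[f] = seeds[f]  # labeled files stay clamped to their label
                continue
            total = sum(weights[f].values())
            # Unlabeled files take the weighted mean of their neighbors' scores.
            updated[f] = (
                sum(w * scores[g] for g, w in weights[f].items()) / total
                if total else 0.0
            )
        scores = updated
    return scores


# Hypothetical toy data: two labeled files and two unlabeled ones.
occurrences = {
    "known_bad.exe": {("h1", 1), ("h2", 1)},
    "known_good.exe": {("h3", 5), ("h4", 6)},
    "unknown_a.exe": {("h1", 1), ("h2", 2)},
    "unknown_b.exe": {("h3", 5), ("h4", 7)},
}
seeds = {"known_bad.exe": 1.0, "known_good.exe": -1.0}
scores = propagate_labels(occurrences, seeds)
```

In this toy setup an unlabeled file inherits a score from the labeled files it co-occurs with, which mirrors the observation in the description that files sharing more spatial and temporal information tend to share a label. A real deployment at the scale reported would need a sparse graph and incremental updates rather than this all-pairs comparison.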