Abstract: |
We present our tool BinThavro, which helps to solve the following general problem: given two programs, how can we compare them? More precisely, how can we understand the similarities, and also the dissimilarities, between the two files? The most difficult but also the most interesting case seems to be that of executable (binary) files, and this problem has an important application: malware analysis. Malware is one of the main tools used by information warfare combatants, the "bad guys" commonly called "cyberwarriors". Unfortunately, so many new malware samples appear almost every day that we need automatic tools to make the analysis faster and more reliable. The Microsoft Security Intelligence Report points out that, for the first half of 2009, around 116 million malicious samples were detected "in the wild", while this number was around 95 million in the second half of 2008. Of course, there are not 116 million distinct malware programs: many of them are clones, or are similar or nearly similar to one another. This shows clearly that the "malware industry" is flourishing, and that malware is an important weapon for the cyberwarriors involved in information warfare. But of course, many "new" malware programs share large portions of code with existing, already known malware (many contain small or large parts of code copied from another program). Here, "known" means analyzed, i.e. we have understood, for example, what the malware does, how it is programmed, how we can detect it with the help of a static signature in antivirus software, and so on. Why would someone want to analyze malware? 
There are (at least) three reasons: we want to understand how we can be protected against it, with or without antivirus software; we want to understand how we can modify it to create a new variant (possibly with new functionality, for example); or we want to "name" a new malware (see (Gheorgescu 2005)). So, at first glance, any tool that can be used to analyze malware can have bad consequences, because it will probably also be used to create new malware. Yes, but this is true for any new language, any new compiler, etc. So, beyond the basic idea of searching for a malware signature, is there any interest in developing new or better tools for malware or goodware analysis? Yes, for at least two reasons: 1. a new view has emerged in recent years: if we have better tools to quickly analyze new malware that is a variant of known malware, malware programmers have to work harder (and so, hopefully, longer) to create new malware that is difficult to analyze; 2. in the last few years, a new threat has appeared among the tools used for information warfare: Targeted Malware Attacks, i.e. malware developed to attack a specific target. This is a really serious problem, because the reaction of the AV community to a new malware depends heavily on the impact of that malware. And there is so much new malware that its analysis is prioritized: resources have to be managed in a balance between the importance of the threats and the ability to analyze a large number of files. There are also reasons why this problem is interesting for goodware, for example: we want to detect plagiarism or copyright infringement (mostly for goodware); we want to understand the evolution of a piece of software (both for goodware and malware); we want to detect vulnerabilities in an old version by comparing the patched and unpatched versions (mostly for malware); we want to detect redundancy within a piece of software (mostly for goodware). … [ABSTRACT FROM AUTHOR] |