Code Cartographer’s Diary
At last year’s Botconf, we have launched Malpedia [1], our community-driven approach to create a free and independent resource for rapid identification and actionable context when investigating malware. While only touching the surface of analysis possibilities last time (mostly surveying PE header characteristics), we want to take a deep dive in this talk, showing the results of more than two years of ongoing in-depth analysis efforts. This time, the focus will be set on the unpacked representatives of more than 700 families of Windows malware.
In the first part of this presentation, we will investigate the usage patterns of the Windows API as exposed by malware. For this, we extend ApiScout [2] with a method to extract API usage fingerprints. We will demonstrate how this information can be used to reliably identify and characterize malware families and that this information seems to capture habits of their respective authors to some degree.
In the second part, we will introduce SMDA [3], a minimalist recursive disassembler library that is optimized for accurate Control Flow Graph (CFG) recovery from memory dumps. SMDA’s output allows us to create a function index, which can be used to identify similar code. On the one hand, we can use this similarity information to recognize and measure how commonly 3rd party libraries are used in malware. On the other hand, we can also isolate the unique, characteristic code for families in order to derive detection signatures for them.
[1] https://malpedia.caad.fkie.fraunhofer.de
[2] https://github.com/danielplohmann/apiscout
[3] https://github.com/danielplohmann/smda