Dynamic binary analysis is a prevalent and indispensable technique in program analysis. While several dynamic binary analysis tools and frameworks have been proposed, all suffer from one or more of: prohibitive performance degradation, a semantic gap between the analysis code and the program being analyzed, architecture/OS specificity, being user-mode only, and lacking APIs. We present DECAF, a virtual machine based, multi-target, whole-system dynamic binary analysis framework built on top of QEMU. DECAF provides Just-In-Time Virtual Machine Introspection and a plugin architecture with a simple-to-use event-driven programming interface. DECAF implements a new instruction-level taint tracking engine at bit granularity, which exercises fine control over the QEMU Tiny Code Generator (TCG) intermediate representation to accomplish on-the-fly optimizations while ensuring that the taint propagation is sound and highly precise. We perform a formal analysis of DECAF's taint propagation rules to verify that most instructions introduce neither false positives nor false negatives. We also present three platform-neutral plugins - Instruction Tracer, Keylogger Detector, and API Tracer, to demonstrate the ease of use and effectiveness of DECAF in writing cross-platform and system-wide analysis tools. Implementation of DECAF consists of 9,550 lines of C++ code and 10,270 lines of C code and we evaluate DECAF using CPU2006 SPEC benchmarks and show average overhead of 605 percent for system wide tainting and 12 percent for VMI.
Bibliographical noteFunding Information:
This research was supported in part by US National Science Foundation Grant #1018217, US National Science Foundation Grant #1054605, and McAfee Inc.
- Dynamic binary analysis
- dynamic taint analysis
- virtual machine introspection