Independently developed codebases typically contain many segments of code that perform the same or closely related operations (semantic clones). Finding functionally equivalent segments enables applications such as replacing a segment with a more efficient or more secure alternative. Such related segments often have different interfaces, so some glue code (an adapter) is needed to replace one with the other. In previous work, we presented an algorithm that searches for replaceable code segments by attempting to synthesize an adapter between them from some finite family of adapters; it terminates if it finds no possible adapter. In this work, we compare binary symbolic execution-based adapter search with concrete adapter enumeration based on Intel's Pin framework, and explore the relation between the size of the adapter search space and total search time. We present examples of applying adapter synthesis to improve the security of binary functions and to switch between binary implementations of RC4. We present two large-scale evaluations: (1) we run adapter synthesis on more than 13,000 function pairs from the Linux C library, and (2) we reverse engineer fragments of ARM binary code by running more than a million adapter synthesis tasks. Our results confirm that several instances of adaptably equivalent binary functions exist in real-world code, and suggest that adapter synthesis can be applied to automatically replace binary code with its adaptably equivalent variants.
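As a rough illustration of searching a finite adapter family, the following Python sketch enumerates concrete adapters that feed each argument of an inner function from either a target argument or a small constant, and keeps the first one that agrees with the target on a set of test inputs. It is only a toy, source-level analogue under stated assumptions: the functions `target` and `inner`, the helper `find_adapter`, the constant set, and the test inputs are all hypothetical, and agreement on tests does not establish equivalence the way the symbolic execution-based search is meant to.

```python
from itertools import product


def target(x, y):
    # Hypothetical target function whose behavior we want to reproduce.
    return x - y


def inner(a, b, c):
    # Hypothetical inner function with a different interface.
    return (b - a) * c


def find_adapter(target, inner, n_target_args, n_inner_args, constants, tests):
    """Enumerate a finite family of argument adapters.

    Each inner argument is supplied by either a target argument
    (("arg", index)) or a constant (("const", value)). Returns the first
    mapping that matches the target on every test, or None once the
    whole family has been tried.
    """
    sources = [("arg", i) for i in range(n_target_args)]
    sources += [("const", c) for c in constants]
    for mapping in product(sources, repeat=n_inner_args):

        def adapted(*args, mapping=mapping):
            # Build the inner call's arguments according to the mapping.
            inner_args = [args[v] if kind == "arg" else v for kind, v in mapping]
            return inner(*inner_args)

        if all(adapted(*t) == target(*t) for t in tests):
            return mapping
    return None  # search space exhausted: no adapter in this family


tests = [(0, 0), (1, 2), (5, 3), (-4, 7), (10, -2)]
adapter = find_adapter(target, inner, 2, 3, constants=[0, 1], tests=tests)
print(adapter)
```

Here the search settles on passing `y` as `a`, `x` as `b`, and the constant 1 as `c`. The family has 4^3 = 64 candidates in this toy setting; richer families (more constants, type conversions, return-value adapters) grow the space quickly, which is the kind of size-versus-time trade-off the abstract refers to.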
Bibliographical note
Funding Information:
This material is based on work supported by the National Science Foundation under Grant Number 1563920. This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA) under contract FA8750-15-C-0110. The views and/or findings contained in this article are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the US Government.
© 1976-2012 IEEE.
- Symbolic execution
- binary analysis
- equivalence checking
- program synthesis