Independently developed codebases typically contain many segments of code that perform the same or closely related operations. Being able to detect these related segments is helpful to applications such as reverse engineering. In this paper, we tackle the problem of determining whether two segments of binary code perform the same operation by asking whether one segment can be substituted by the other. A key insight behind our approach is that because these segments often have different interfaces, some glue code (an adapter) will be needed to perform the substitution. Here we present an algorithm that searches for substitutable code segments by attempting to synthesize an adapter between them. We implement our technique using concrete adapter enumeration and binary symbolic execution to explore the relation between the size of the adapter search space and the total search time. Then, using more than 61,000 fragments of binary code extracted from an ARM image built for the iPod Nano 2g device and functions from the VLC media player, we evaluate our adapter synthesis implementation on more than one million synthesis tasks. Our tool finds dozens of instances of VLC functions in the firmware image. These results confirm that instances of adaptably substitutable binary functions exist in real-world code, and suggest that adapter synthesis has promising reverse engineering applications.
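To make the idea of an adapter concrete, the following is a minimal sketch of concrete adapter enumeration on ordinary Python functions. It is not the implementation described in the abstract, which works on binary code and uses symbolic execution; here a small set of test inputs stands in for that equivalence check, so a returned adapter is only known to agree on those inputs. All names (`enumerate_adapters`, `synthesize_adapter`, `target_sub`, `inner_scaled_diff`) and the adapter family (each inner argument is either one of the target's arguments or a small constant) are assumptions made for illustration.

```python
from itertools import product


def enumerate_adapters(n_target_args, n_inner_args, constants=(0, 1)):
    """Yield every adapter in a simple, assumed adapter family.

    An adapter is a tuple with one entry per inner-function argument.
    Each entry is ("arg", i), meaning "pass the target's i-th argument",
    or ("const", c), meaning "pass the constant c".
    """
    choices = [("arg", i) for i in range(n_target_args)]
    choices += [("const", c) for c in constants]
    yield from product(choices, repeat=n_inner_args)


def apply_adapter(adapter, inner, target_args):
    """Call `inner` with arguments built from `target_args` by the adapter."""
    inner_args = [target_args[v] if kind == "arg" else v for kind, v in adapter]
    return inner(*inner_args)


def synthesize_adapter(target, inner, n_target_args, n_inner_args, tests,
                       constants=(0, 1)):
    """Return the first adapter under which `inner` matches `target` on all tests.

    Agreement on tests is weaker than equivalence; it only stands in for
    the symbolic check mentioned in the abstract.
    """
    for adapter in enumerate_adapters(n_target_args, n_inner_args, constants):
        if all(apply_adapter(adapter, inner, t) == target(*t) for t in tests):
            return adapter
    return None


# Hypothetical pair of functions with different interfaces.
def target_sub(a, b):
    return a - b


def inner_scaled_diff(x, y, scale):
    return (y - x) * scale


tests = [(5, 3), (3, 5), (0, 0), (7, 1)]
adapter = synthesize_adapter(target_sub, inner_scaled_diff, 2, 3, tests)
print(adapter)
```

In this toy case the search swaps the two arguments and fixes `scale` to the constant 1. The number of candidates is (number of choices) raised to the number of inner arguments, which illustrates why the size of the adapter search space affects the total search time.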
|Original language||English (US)|
|Title of host publication||Proceedings - 2018 IEEE 11th International Conference on Software Testing, Verification and Validation, ICST 2018|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||11|
|State||Published - May 25 2018|
|Event||11th IEEE International Conference on Software Testing, Verification and Validation, ICST 2018 - Vasteras, Sweden|
Duration: Apr 9 2018 → Apr 13 2018
Bibliographical note
Funding Information:
This material is based on work supported by the National Science Foundation under Grant Number 1563920. This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA) under contract FA8750-15-C-0110. The views and/or findings contained in this article are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the US Government.
© 2018 IEEE.
- binary analysis
- program synthesis
- symbolic execution