Dye-based binary imaging to find bugs in software

IP.com Disclosure Number: IPCOM000199851D
Original Publication Date: 2010-Sep-17
Included in the Prior Art Database: 2010-Sep-17
Document File: 5 page(s) / 49K

Publishing Venue

Microsoft

Related People

Tim Burrell: INVENTOR

Abstract

In biology, radio-imaging uses a substance such as a barium meal to detect abnormalities in the digestive system; similarly, a dye can be used to track cell cycle transformations.

This is the abbreviated version, containing approximately 28% of the total text.

Document Author (alias)

Tim Burrell (timb)

Defensive Publication Title 

Dye-based binary imaging to find bugs in software

Name(s) of All Contributors

Summary of the Defensive Publication/Abstract

In biology, radio-imaging uses a substance such as a barium meal to detect abnormalities in the digestive system; similarly, a dye can be used to track cell cycle transformations.

The process described here for finding abnormalities in software is analogous: instrumentation code (the “dye”) is inserted into the program source code, the program is then compiled (“digested”/“transformed”), and the resulting binary is examined for abnormalities, which can be mapped back to bugs in the code. The instrumentation is chosen so that its interaction with surrounding buggy code during compilation makes the examination stage simple and efficient.

Description

Dye-based binary imaging to find bugs in software

Overview

Extra instructions (instrumentation) are inserted into the program source code. The program is then compiled and optimized, during which the instrumentation instructions are combined with the surrounding instructions to produce new and different instructions. These new instructions form the basis of the program binary, which is the output of the compilation/optimization process.
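The insertion step can be sketched as follows. This is only an illustration, not the instrumentation used by the author: the marker macro DYE_MARK, the function name insert_dye and the choice of insertion sites are all hypothetical. The sketch assumes the caller already knows which source lines to instrument, and it records which marker went where so that a later finding can be traced back to the source.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical marker statement; the original text does not specify one.
DYE_TEMPLATE = "DYE_MARK({dye_id});"


def insert_dye(
    source_lines: List[str], sites: Iterable[int]
) -> Tuple[List[str], Dict[int, int]]:
    """Insert one marker statement after each 1-based line number in `sites`.

    Returns the instrumented lines and a map from dye id to the original
    source line number.
    """
    wanted = set(sites)
    instrumented: List[str] = []
    dye_to_line: Dict[int, int] = {}
    next_id = 1
    for lineno, line in enumerate(source_lines, start=1):
        instrumented.append(line)
        if lineno in wanted:
            # Reuse the indentation of the instrumented line.
            indent = line[: len(line) - len(line.lstrip())]
            instrumented.append(indent + DYE_TEMPLATE.format(dye_id=next_id))
            dye_to_line[next_id] = lineno
            next_id += 1
    return instrumented, dye_to_line
```

The instrumented lines would then be handed to the ordinary compiler and optimizer; nothing in this sketch depends on a particular toolchain.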

The premise of dye-based binary imaging is that, for suitably chosen instrumentation, compiling/optimizing “buggy code + instrumentation code” produces specific instructions or patterns that stand out in the resulting program binary, so these patterns can be mapped back to the corresponding source code bugs. Compiling/optimizing “non-buggy code + instrumentation code” does not produce these patterns.
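A minimal sketch of the examination stage, under stated assumptions: the binary has already been disassembled into text lines, and the tell-tale pattern is an instruction loading an immediate whose leading hex digits are a fixed tag (0xD1E) and whose last four hex digits carry a dye id. Both the signature and the function name find_abnormalities are hypothetical; the original text does not say what the actual patterns look like.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical signature: "mov <reg>, 0xD1E<4 hex digits>", where the four
# trailing digits encode the dye id. Not taken from the original text.
SIGNATURE = re.compile(r"\bmov\s+\w+,\s*0x[dD]1[eE]([0-9a-fA-F]{4})\b")


def find_abnormalities(
    listing: Iterable[str], dye_to_line: Dict[int, int]
) -> List[Tuple[int, int]]:
    """Return (listing line number, source line number) for each signature hit."""
    hits: List[Tuple[int, int]] = []
    for lineno, text in enumerate(listing, start=1):
        match = SIGNATURE.search(text)
        if match is None:
            continue
        dye_id = int(match.group(1), 16)
        # Only report ids that correspond to a known instrumentation site.
        if dye_id in dye_to_line:
            hits.append((lineno, dye_to_line[dye_id]))
    return hits
```

Because the scan is a plain pattern match over the compiled output, it needs no dataflow tracking; all of the discriminating work is assumed to have been done by the compiler when it combined the instrumentation with the surrounding code.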

Related ideas

We start by briefly describing two related techniques and how the approach described here differs from each.

(1)   Taint-based dataflow analysis

The similarity here is that a dye-like entity is tracked through the program. The difference is that in dataflow analysis the “dye” is input data to the program rather than a modification of the program itself. Correctly tracking dataflow statically (i.e. without running the software) is complex and prone to false positives, both because of limitations of the available analysis techniques and for theoretical reasons (e.g. code ambiguity may prevent deterministic resolution of function calls).

Because the focus is on user data, the dataflow analysis approach typically finds a different class of bug: “misuse of user data” rather than the “illegal program instructions” bugs that the dye-based binary imaging method finds.

(2)   Runtime instrumentation

Modification of a program for the purpo...