Binary Disassembly Pattern Analysis

IP.com Disclosure Number: IPCOM000231860D
Original Publication Date: 2013-Oct-10
Included in the Prior Art Database: 2013-Oct-10
Document File: 4 page(s) / 785K

Publishing Venue

Microsoft

Related People

Vignesh Murugesan: INVENTOR [+2]

Abstract

The invention describes a method for analyzing and searching patterns to find vulnerable variations of those patterns in the disassembly of a binary. The method (a) searches the disassembly data for a given source code pattern, (b) uses a disassembly pattern to generate semantically similar code patterns, (c) detects interlocking patterns in source code, (d) scopes conditional and unconditional jumps across assembly code blocks in a given function space, (e) automatically constructs matches from generic symbols in a user-defined search query, (f) searches for the appropriate assembly patterns for a given source code line and maps the matched assembly results back to their respective source code lines, and (g) auto-suggests possible assembly pattern variants using search history and heuristics.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 59% of the total text.

Document Author (alias)

vigneshm

Defensive Publication Title 

Binary Disassembly Pattern Analysis

Name(s) of All Contributors

Navjot S. Rattan

Vignesh Murugesan


Description

After accepting the user-specified binary and the symbol server that holds its symbols, we fetch the matching symbols from the symbol server and disassemble the binary. The disassembled data is then parsed and fed into our database, which segments it by function. The search module queries the function data and searches each function individually. The database fetch and search stages are synchronized and multithreaded for efficiency.

After each fetch, we build the instruction sequences (ignoring parameters at this stage) that could possibly match the user-defined instruction query, and then match each candidate against the user query, this time including parameters. This algorithm avoids the scenario in which matches for a pattern query are intertwined with other such matches in the disassembled data and therefore become undetectable during search. It also helps avoid path poisoning, where an invalid path might spoil detection of a valid path. This is done without any backtracking on the disassembly data, as the data stream is optimized for sequential access.

During matching, the user query, which contains placeholder symbols, is compared against the possible matches in the disassembly data, and the placeholder symbols are constructed in parallel. An incompatibility in a placeholder match denotes a mismatch, in which case we move on with the search. We also detect conditional and unconditional jumps across assembly code blocks and accept a match only if the program flow sequence of the matched instructions is a definite possibility, which ensures the accuracy of each match.

The user either gives a source code line from which we get the appropriate assembly pattern to search for or the use...
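
The matching step described above (candidates selected by instruction first, then checked with parameters while placeholder symbols are bound, all in one forward pass) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The tuple representation of instructions, the `$name` placeholder syntax, the names `bind` and `find_matches`, and the `max_span` window are all assumptions made for illustration. Jump scoping across code blocks is omitted.

```python
from typing import Dict, List, Optional, Tuple

Instr = Tuple[str, Tuple[str, ...]]   # (mnemonic, operands); hypothetical representation
Bindings = Dict[str, str]             # placeholder symbol -> concrete operand


def bind(query_ops: Tuple[str, ...], instr_ops: Tuple[str, ...],
         bindings: Bindings) -> Optional[Bindings]:
    """Return extended bindings if the operands are compatible, else None."""
    if len(query_ops) != len(instr_ops):
        return None
    new = dict(bindings)
    for q, actual in zip(query_ops, instr_ops):
        if q.startswith("$"):                  # placeholder symbol (assumed syntax)
            if new.setdefault(q, actual) != actual:
                return None                    # already bound to a different operand
        elif q != actual:
            return None                        # literal operand differs
    return new


def find_matches(stream: List[Instr], query: List[Instr],
                 max_span: int = 16) -> List[Tuple[Tuple[int, ...], Bindings]]:
    """Single forward pass over the stream; earlier instructions are never re-read.

    Every partial match is carried forward independently, so one candidate
    failing on an instruction does not disturb another candidate that is
    interleaved with it.
    """
    partials: List[Tuple[Tuple[int, ...], Bindings]] = []
    results: List[Tuple[Tuple[int, ...], Bindings]] = []
    for i, (mnem, ops) in enumerate(stream):
        carried: List[Tuple[Tuple[int, ...], Bindings]] = []
        # The empty partial lets a new match start at any instruction.
        for idxs, b in partials + [((), {})]:
            if idxs and i - idxs[0] >= max_span:
                continue                       # candidate expired (assumed window)
            if idxs:
                carried.append((idxs, b))      # keep waiting for a later instruction
            q_mnem, q_ops = query[len(idxs)]
            if mnem != q_mnem:                 # cheap check that ignores parameters
                continue
            nb = bind(q_ops, ops, b)           # full check, including parameters
            if nb is None:
                continue                       # placeholder incompatibility: mismatch
            new_idxs = idxs + (i,)
            if len(new_idxs) == len(query):
                results.append((new_idxs, nb))
            else:
                carried.append((new_idxs, nb))
        partials = carried
    return results


# Two matches whose instructions are interleaved in the disassembly.
STREAM: List[Instr] = [
    ("mov", ("eax", "[ebp+8]")),
    ("mov", ("ecx", "[ebp+12]")),
    ("add", ("eax", "1")),
    ("add", ("ecx", "1")),
    ("push", ("eax",)),
    ("push", ("ecx",)),
]
QUERY: List[Instr] = [("mov", ("$r", "$m")), ("add", ("$r", "1")), ("push", ("$r",))]

for indices, bound in find_matches(STREAM, QUERY):
    print(indices, bound)
```

In this sketch both interleaved occurrences are reported, because each candidate keeps its own placeholder bindings and is advanced only by instructions compatible with them. A real implementation would presumably also bound the number of live candidates and apply the jump-scoping check before accepting a match.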