Method and Apparatus for Handling Self Modifying Code in Dynamic Binary Translated Simulators

IP.com Disclosure Number: IPCOM000245170D
Publication Date: 2016-Feb-16
Document File: 6 page(s) / 249K

Publishing Venue

The IP.com Prior Art Database


This is the abbreviated version, containing approximately 18% of the total text.


Introduction

Instruction set simulation executes programs written for one architecture (the simulated, or target, machine) on another architecture (the host machine). The two architectures may be the same or different. There are many techniques for building instruction set simulators [1], [2], [3]. Software interpreters simulate the fetch-decode-execute cycle of the simulated architecture in software: they read one instruction at a time, decode it, and simulate its execution. Dynamic binary translators offer better performance by taking a group of target instructions (often a basic block) and generating a corresponding sequence that executes directly on the host processor. The translated code sequences are stored in a buffer called the translation cache. Translation is interleaved with execution of the translated code. Before executing a new input sequence, the simulator checks whether a translation of it already exists in the translation cache; if so, the previously translated sequence is executed again. This avoids the repeated decode overhead incurred by typical interpretive simulators.
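
The lookup-translate-execute loop described above can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: the four-opcode target ISA (ADD, DEC, JNZ, HALT), the state dictionary, and the function names translate_block and simulate are all assumptions made for the example, and Python source text stands in for generated host machine code.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, int]            # (opcode, operand) of a made-up toy target ISA
Block = Callable[[dict], int]      # translated block: updates state, returns next pc


def translate_block(memory: List[Instr], pc: int) -> Block:
    """Translate the basic block starting at pc into one host-level function.

    Python source is used here as a stand-in for host machine code.
    """
    lines = ["def block(state):"]
    while True:
        op, arg = memory[pc]
        pc += 1
        if op == "ADD":
            lines.append(f"    state['acc'] += {arg}")
        elif op == "DEC":
            lines.append(f"    state['ctr'] -= {arg}")
        elif op == "JNZ":              # a branch ends the basic block
            lines.append(f"    return {arg} if state['ctr'] != 0 else {pc}")
            break
        elif op == "HALT":             # -1 is this sketch's "stop" marker
            lines.append("    return -1")
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    namespace: dict = {}
    exec("\n".join(lines), namespace)  # "code generation" for the host
    return namespace["block"]


def simulate(memory: List[Instr], state: dict) -> Tuple[int, int]:
    """Run from pc 0; return (blocks translated, blocks executed)."""
    cache: Dict[int, Block] = {}       # translation cache keyed by target pc
    translated = executed = 0
    pc = 0
    while pc != -1:
        block = cache.get(pc)
        if block is None:              # miss: translate once and remember it
            block = translate_block(memory, pc)
            cache[pc] = block
            translated += 1
        pc = block(state)              # hit or fresh: run the translated code
        executed += 1
    return translated, executed


if __name__ == "__main__":
    program = [("ADD", 5), ("DEC", 1), ("JNZ", 0), ("HALT", 0)]
    state = {"acc": 0, "ctr": 3}
    print(simulate(program, state), state["acc"])   # (2, 4) 15
```

The loop body runs three times but is translated only once; an interpreter would decode each of its instructions on every iteration.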

Background

Because dynamic binary translated simulators rely on a translation cache, executing a translated code sequence produces correct program behavior only if the original sequence it simulates has not been modified since it was translated. Computer architectures permit program instructions to write to the address space allocated for program instructions. Code that does so is commonly referred to as “Self-Modifying Code”. If a target instruction has been translated and stored in the translation cache, the translated copy becomes obsolete when the target instruction is modified during further execution. This is also referred to as the translation cache coherency problem.
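
The coherency problem can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not the method this disclosure proposes (the abbreviated text does not include it): the Simulator class, its store method, the invalidate_on_write flag, and the two-opcode toy ISA are hypothetical. Evicting cached blocks whose address range covers a written address is shown only as one generic remedy.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, int]   # toy target ISA (hypothetical): ("ADD", n) or ("HALT", 0)


class Simulator:
    def __init__(self, memory: List[Instr], invalidate_on_write: bool) -> None:
        self.memory = list(memory)
        self.invalidate_on_write = invalidate_on_write
        # translation cache: start pc -> (end pc, translated block)
        self.cache: Dict[int, Tuple[int, Callable[[], int]]] = {}

    def _translate(self, pc: int) -> Tuple[int, Callable[[], int]]:
        """Fold a run of ADDs ending in HALT into one host-level function."""
        total = 0
        while self.memory[pc][0] == "ADD":
            total += self.memory[pc][1]
            pc += 1
        if self.memory[pc][0] != "HALT":
            raise ValueError(f"unknown opcode {self.memory[pc][0]!r}")
        return pc + 1, (lambda: total)

    def run_block(self, pc: int) -> int:
        """Execute the block at pc, translating it only on a cache miss."""
        if pc not in self.cache:
            self.cache[pc] = self._translate(pc)
        return self.cache[pc][1]()

    def store(self, addr: int, instr: Instr) -> None:
        """Model a target store into the instruction address space."""
        self.memory[addr] = instr
        if self.invalidate_on_write:
            # Evict every cached block whose source range covers addr.
            stale = [s for s, (end, _) in self.cache.items() if s <= addr < end]
            for s in stale:
                del self.cache[s]


if __name__ == "__main__":
    program = [("ADD", 1), ("ADD", 2), ("HALT", 0)]
    for flag in (False, True):
        sim = Simulator(program, invalidate_on_write=flag)
        before = sim.run_block(0)
        sim.store(1, ("ADD", 40))      # the target program modifies its own code
        print(flag, before, sim.run_block(0))
    # False 3 3   (stale translation is reused)
    # True 3 41   (block is retranslated after the write)
```

Without invalidation the second run returns 3, the result of code that no longer exists in target memory; with it, the block is retranslated and returns 41.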

Self-Modifying Code occurs in practice in several situations, some of which are described here. First, a computer system has a finite amount of memory, which forces operating systems to reuse the same region of memory to store different programs at different times. If translation cache coherency is not handled properly in this scenario, the simulator will execute the translation of a previously stored program even though the operating system expects the newly stored program to run. Second, some systems employ a programming technique in which an instruction sequence repeatedly modifies itself during its own execution. This is typically used where performance is critical, such as in the innermost loops of graphics drivers. Third, code patching is used to resolve external symbol conflicts between modules that are compiled separately. At compilation time, the first module does not know the address of a function that is stored in a second module. The address of this function becomes known to the operating system only at run time. The origi...