Browse Prior Art Database

Extracting a part of the program

IP.com Disclosure Number: IPCOM000185686D
Original Publication Date: 2009-Jul-31
Included in the Prior Art Database: 2009-Jul-31
Document File: 5 page(s) / 83K

Publishing Venue

IBM

Abstract

Often a developer has to write a program that is very similar to an existing one. In such case using the code of the existing program is much easier than writing from scratch. However, extracting the needed parts of the code can be a difficult task even if the source code is available. The approach described in this article finds the parts of the code that are responsible for determined functionality of the program. The method used is based on building the coverage of the executed code by tracing all branch instructions in the machine code of the investigated program and then mapping it with the source code. This approach reduces the time of extracting the part of the program significantly and eliminates a lot of routine work. It can also be applied for estimating the coverage of unit tests.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 50% of the total text.

Page 1 of 5

Extracting a part of the program

Introduction:

This solution was created out of the task to make a program (A) that has very similar basic functionality with another program (B). Main task is to find a method how to extract part of program instead of rewriting it from scratch. In order to do so a programmer needs to know what part of the source program (B) code is responsible for this functionality. User can run the program (B) and execute required functionality. It is enough to know what part of the code (B) was executed to perform these functions. It will contain all or more code than required one, excluding error handling.

Task:

What part of the code was executed to perform the defined actions of the program?

Solution description:

Part 1. Tracing.

a) Save load address of the module, last address of the module and exit point. LOAD

_ADDR - load address of the module.

LAST

_ADDR - last address of the module.

LAST

_ADDR - LOAD

_

ADDR + 1 = SIZE - size of the module.

EXIT

_ADDR(1…N) - exit points of the module.

b) If the load address and start address are different than start tracing the whole module and write down the first instruction address
TRACE INST FROM LOAD

_ADDR TO LAST

_ADDR and write down

ENTRY

_POINT

else ENTRY

_

POINT = LOAD

_ADDR

c) Start tracing all branch instruction from load address to last address of the module. Start tracing all exit points
TRACE BRANCH FROM LOAD

_ADDR TO LAST

_ADDR

TRACE INST FROM EXIT

ADDR(1) TO EXIT

_ADDR(1)

TRACE INST FROM EXIT

_

_ADDR(N) TO EXIT

_ADDR(N)

d) Start executing the defined actions we wanted to find code for.

e) For each breakpoint save instruction address and branch address. INST

_

ADDR: BRANCH

_INST BRANCH

_ADDR

f) Ignore the instructions with branch address out of the interval [LOAD

_

ADDR,

LAST

_ADDR]

BRANCH

_

ADDR < LOAD

_ADDR OR BRANCH

_

ADDR > LOAD

_

ADDR - ignore

g) Stop tracing on exit point. Now we have a list of all branch instructions INSTRUCTION
SEQUENCE

INSTRUCTION ADDRESS

BRANCH ADDRESS

1

Page 2 of 5

0 ENTRY

_POINT

None

1 INST

_ADDR(1)

BRANCH

_ADDR(1)

2 INST

_ADDR(2)

BRANCH

_ADDR(2)

… … … K-1 INST

_ADDR(K-1)

BRANCH

_ADDR(K-1)

K EXIT

_POINT(n)

none

Part 2. Compiling.

Build interval list of executed instructions:
Code was executed from target address of current branch instruction occurrence till the next branch instruction occurrence address.

First interval is from load-address to instruction address of the first branch. Next intervals are from branch address to next branch instruction occurrence address. And last one is from last branch instruction till the exit point. There are K-1 intervals. This intervals may intersect each other.

Interval number Interval start Interval end I ENTRY

_POINT

INST

_ADDR(1)

1 BRANCH

_ADDR(1)

INST

_ADDR(2)

2 BRANCH

_ADDR(2)

INST

_ADDR(3)

… … … K-2 BRANCH

_ADDR(K-2)

INST

_ADDR(K-1)

K-1 BRANCH

_ADDR(K-1)

EXIT

_POINT(n)

These intervals should be shifted left by value LOAD

_ADDR in order to...