Browse Prior Art Database

Function, Method and System to Mine Frequent Instruction Sequences from Hardware Sample Data

IP.com Disclosure Number: IPCOM000194715D
Publication Date: 2010-Apr-09
Document File: 4 page(s) / 30K

Publishing Venue

The IP.com Prior Art Database

Abstract

Mining frequent instruction sequences from hardware sample data is to find out which instruction sequences appear at a lot of addresses and have significant ticks (or other events, i.e. icdirms, dcdirms and etc.) to locate optimization opportunities and identify workload characteristics. Above function is very important due to its following applications: Identify performance bottlenecks: Which sequences are hot? Identify hidden reasons for performance events: Which sequences tend to cause ICDirMs (DCDirMs, TLBMs, …) ? Identify workload characteristics: Is there a lot of Java (DB2, network, Disk I/O ) sequences in the workload? So that we can use such information to do capacity planning.Such information could also be used to detect virus.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 54% of the total text.

Page 1 of 4

Function, Method and System to Mine Frequent Instruction Sequences from Hardware Sample Data

Design and Implementation (1/4)

Input


SET0 : a set of records (sid, seq, opcode, address, targetAddress, ticks)
minSupport
Output
ResultSET: a set of result records ( opcode sequence, totalTicks, count )
Algorithm
Step 1. Get all 1-len frequent sequences, using following steps:
• Scan SET0 once, partition SET0 into a group of subsets, with each subset corresponding with an opcode;

SET1 = { }, for each subset of SET0 , if its size > minSupport, add (opcode, totalTicks, subset size) into ResultSET, and add all records in the subset into SET1

SET1

IA_table

DISASSM_table

SID

Address

SID

Address

SEQ OPCODE TargetAddress …

SID

Address

SEQ OPCODE

TargetAddress …

SID

Address

SEQ OPCODE

TargetAddress …

L

LEN1_table (SET0)

ST

SID

Address

SEQ OPCODE

TargetAddress ……

SID

Address

SEQ OPCODE

TargetAddress …

TM

1

Page 2 of 4

Design and Implementation (2/4)

Input


SET0 : a set of records (sid, seq, opcode, address, targetAddress, ticks)
minSupport
Output
ResultSET: a set of result records ( opcode sequence, totalTicks, count )
Algorithm
Step 2. Get all 2-len frequent sequences, using following steps:
tempSET = SELECT A.*, B.* FROM SET1 A JOIN SET1 B ON (A. sid = B. sid AND A.

seq = B. seq-1) OR ( A. targetAddress = B. address)

• scan tempSET once, partition tempSET into a group of subsets, with each subset corresponding with a 2-len opcode sequence: "opcode, opcode1";

SET2 = { }, for each subset of tempSET , if its size > minSupport, add ("opcode, opcode1", totalTicks, subset size) into ResultSET, and add all records in the subset into SET2

TargetAddress ……

(SET1) (SET1)

Address

SID TargetAddress ……

SEQ...