Method for a highly precise cost-effective hardware apparatus to collect CPU state information and system bus activity for RTL simulation

IP.com Disclosure Number: IPCOM000008845D
Publication Date: 2002-Jul-17
Document File: 4 page(s) / 45K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a highly precise cost-effective hardware apparatus to collect CPU state information and system bus activity for register-transfer level (RTL) simulation. Benefits include improved trace capability.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 50% of the total text.

Background

              Simulating a post-silicon failure on a 64-bit architecture platform requires a highly precise trace-capture apparatus. Using specialized and novel techniques, we have employed a commercial logic analyzer (LA) as a cost-effective trace-capture apparatus.

              A commercial LA imposes limitations on the trace buffer size. The maximum size of the trace buffer of a typical LA is less than 32 MB of bus samples. The trace from a 64-bit system could be many times larger, making it infeasible to support the simulation of system traces in RTL. Each new generation of the processor family requires larger trace buffers because of faster and wider buses, larger and more numerous microarchitectural structures, and larger caches. The trace buffer of a commercial LA is not only inadequate in size but also very expensive.

Description

              The disclosed method enables RTL simulation of multiprocessor traces using a commercial LA with a depth of only 16 MB. In addition to optimizing the LA trace-buffer usage, the method addresses other limitations inherent to simulating traces, such as support for odd front-side bus ratios.

Trace buffer reuse

              Even though a test may take several minutes or hours to fail, all the interesting events leading up to the failure usually happen in fewer than 1 million clock cycles (designated as n). To replicate the failure in an RTL environment, these n clocks are adequate as long as the CPU state is captured just before they execute. To capture traces for millions of clock cycles while retaining only a portion of them, the LA is configured as a wrap-around trace buffer whose size equals n plus the CPU state samples. No matter how long the test runs, the trace buffer contains the last n cycles before the failure.
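The wrap-around behavior described above can be illustrated in software. The following is a minimal sketch only, not the LA configuration itself: the function name capture_last_n and the use of integers as bus samples are assumptions made for illustration, and the extra space for CPU state samples is left out.

```python
from collections import deque


def capture_last_n(samples, n):
    """Keep only the most recent n samples, as a wrap-around buffer would.

    Hypothetical helper for illustration; a real LA does this in hardware.
    """
    buf = deque(maxlen=n)  # once full, each append discards the oldest entry
    for sample in samples:
        buf.append(sample)
    return list(buf)


# A run of 10 bus clocks captured into a 4-entry buffer:
# only the last 4 clocks before the end of the run remain.
trace = capture_last_n(range(10), n=4)
print(trace)  # [6, 7, 8, 9]
```

The buffer size is fixed in advance, so the length of the test has no effect on how much storage is needed.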

Filtering to reduce trace length

              The CPU state dump and system bus trace, along with ITP activity, can take over 100 million bus clocks, which exceeds the capacity of a typical LA trace buffer. However, of these cycles, no more than 3 million active bus clocks occur per processor, and storing just these active samples is sufficient. A filter stores only the active samples. Bus activity is determined from the control signals (see Figure 1).

              To simulate the trace in RTL, all cycles, including idle cycles, are required. The idle cycles are reconstructed with the aid of a clock-stamp counter. The clock-stamp count value is recorded in the trace buffer whenever an active cycle is saved, and software later processes the stored counts to reconstruct the idle cycles.
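The filtering and reconstruction steps can be sketched together in software. This is an illustration under stated assumptions rather than the disclosed implementation: the names filter_active and reconstruct are hypothetical, an idle cycle is represented by None instead of being decoded from the control signals, and the clock-stamp counter is assumed never to wrap.

```python
IDLE = None  # assumed stand-in for an idle bus cycle


def filter_active(bus_cycles):
    """Store only active cycles, each paired with its clock-stamp count."""
    return [
        (clock, sample)
        for clock, sample in enumerate(bus_cycles)
        if sample is not IDLE
    ]


def reconstruct(stored, total_clocks):
    """Rebuild the cycle-by-cycle trace, reinserting the idle cycles."""
    full = [IDLE] * total_clocks
    for clock, sample in stored:
        full[clock] = sample  # the clock stamp gives the original position
    return full


# Seven bus clocks, of which only three carry activity.
bus = ["A", IDLE, IDLE, "B", IDLE, "C", IDLE]
stored = filter_active(bus)
print(stored)  # [(0, 'A'), (3, 'B'), (5, 'C')]
print(reconstruct(stored, len(bus)) == bus)  # True
```

Only three entries are stored for seven clocks, yet the full sequence is recovered exactly, because the gaps between consecutive clock stamps give the number of idle cycles to reinsert.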

Triggering for system hang

              When a failure is characterized sufficiently so that the processor breaks at ...