Method for an intelligent kernel to accelerate the functional characterization of CPU/platform anomalies

IP.com Disclosure Number: IPCOM000008846D
Publication Date: 2002-Jul-17
Document File: 5 page(s) / 112K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for an intelligent kernel to accelerate the functional characterization of CPU/platform anomalies. Benefits include improved functionality, improved performance, and improved testing/debugging productivity.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 50% of the total text.

Background

              Root-cause analysis of failures on multiprocessor systems requires machine state information. Specifically, the status and contents of several structures are required, including:

·        Registers

·        Translation look-aside buffers (TLBs)

·        Branch prediction tables

·        Level-1 cache

·        Level-2 cache

              The conventional method of reading a cache array is the “brute force” approach: each line of the tag array is read one entry at a time and written out to the front-side bus, and the same procedure is then repeated for the data array (see Figure 1). This method is time-consuming because every data entry is read regardless of whether its line is valid. The output is similar to the following:

Tag1

Tag2

Tag3

.

.

TagN

Data1

Data2

Data3

.

.

DataN

              CPU systems can run the core clock and the front-side bus clock at various ratios. The ratio changes the parameters that must be set in the test register to read the caches. Conventionally, this has required either loading new diagnostic code each time the bus or core frequency changes, or keeping lookup tables in the code for each ratio, which degrades performance.

 


Description

              The disclosed method is a procedure for reading storage structures that significantly reduces root-cause analysis time. The tag array is still read line by line, but the data array is read for an address only if the tag's valid field is set and the line is in a non-invalid state (P/M/S). For example, if tags 1 and 3 are valid and modified but tag 2 is invalid, the output is similar to the following:

Tag1

Data1

Tag2

Tag3

Data3

.

.

DataN
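
The selective read described above can be sketched in Python as follows. This is only an illustration of the idea, not the disclosed kernel code: the `CacheLine` type, the `dump_selective` function, and the use of `"I"` to mark an invalid state are hypothetical names and encodings introduced here, and the example uses two data pieces per line to keep the output short.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CacheLine:
    """Hypothetical model of one cache line (not from the disclosure)."""
    tag: int
    valid: bool
    state: str        # assumed encoding: "I" = invalid, anything else = non-invalid
    data: List[int]   # data pieces held by this line


def dump_selective(lines: List[CacheLine]) -> List[Tuple[str, int]]:
    """Emit every tag, but emit data only for valid, non-invalid lines."""
    trace = []
    for line in lines:
        trace.append(("TAG", line.tag))
        if line.valid and line.state != "I":
            for piece in line.data:
                trace.append(("DATA", piece))
    return trace


# Tags 1 and 3 are valid and modified; tag 2 is invalid.
lines = [
    CacheLine(tag=1, valid=True, state="M", data=[0xA0, 0xA1]),
    CacheLine(tag=2, valid=False, state="I", data=[0, 0]),
    CacheLine(tag=3, valid=True, state="M", data=[0xC0, 0xC1]),
]
trace = dump_selective(lines)
kinds = [kind for kind, _ in trace]
```

In this sketch, tag 2 is followed directly by tag 3 in `trace`, mirroring the Tag2/Tag3 sequence in the listing above.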

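              Each cache line holds sixteen pieces of data, so the conventional method performs 17 reads per line (one tag plus 16 data). If half the cache lines are valid, the disclosed method averages 9 reads per line (one tag plus 8 data), so, assuming each read takes about the same time, the time required to read the array is 9/17 of the conventional time. The script that processes the cache read is written with the assumption that a valid tag is followed by its data, while an invalid tag is followed directly by the next tag. The result is a reconstruction of the cache contents from the captured bus trace.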
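
The 9/17 figure and the parsing assumption can be checked with a short Python sketch. The trace record layout (`("TAG", tag, valid)` and `("DATA", value)`) and the function names `read_cost_ratio` and `reconstruct` are assumptions made for illustration; the disclosure does not describe the actual trace format or script.

```python
from fractions import Fraction

PIECES_PER_LINE = 16  # from the text: sixteen pieces of data per cache line


def read_cost_ratio(valid_fraction, pieces=PIECES_PER_LINE):
    """Selective reads per line divided by conventional reads per line."""
    conventional = 1 + pieces                          # one tag plus all data
    selective = 1 + Fraction(valid_fraction) * pieces  # data only for valid lines
    return selective / conventional


def reconstruct(trace, pieces=PIECES_PER_LINE):
    """Rebuild {tag: data} from a flat trace (hypothetical record layout).

    A valid tag must be followed by `pieces` data records;
    an invalid tag is followed directly by the next tag.
    """
    cache = {}
    i = 0
    while i < len(trace):
        record = trace[i]
        if record[0] != "TAG":
            raise ValueError(f"expected a tag record at index {i}")
        _, tag, valid = record
        i += 1
        if valid:
            chunk = trace[i:i + pieces]
            if len(chunk) != pieces or any(r[0] != "DATA" for r in chunk):
                raise ValueError(f"tag {tag} is not followed by {pieces} data records")
            cache[tag] = [r[1] for r in chunk]
            i += pieces
        else:
            cache[tag] = None  # invalid line: no data was dumped
    return cache


ratio = read_cost_ratio(Fraction(1, 2))

# Small trace with two data pieces per line to keep the example short.
trace = [
    ("TAG", 1, True), ("DATA", 10), ("DATA", 11),
    ("TAG", 2, False),
    ("TAG", 3, True), ("DATA", 30), ("DATA", 31),
]
cache = reconstruct(trace, pieces=2)
```

The parser raises an error rather than guessing when a valid tag is not followed by the expected number of data records, since a misaligned trace would otherwise silently corrupt every later line.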

              The user may not require the entire CPU state to determine the root cause of a failure. For a simple test, the register state may be sufficient. For this reason, we have built a smart predicate control system through which the user selects which arrays to read and dump. This approach avoids the need to maintain multiple kernels, which would complicate the process.
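
The selection idea can be modeled in Python with a set of flags, one per structure. This is a sketch of the concept only: the disclosure does not describe how the predicate control is implemented, and the `Dump` flags, the `run_dumps` function, and the placeholder dump routines are all hypothetical.

```python
from enum import Flag, auto


class Dump(Flag):
    """Hypothetical selection flags, one per structure listed in the Background."""
    REGISTERS = auto()
    TLB = auto()
    BRANCH_PREDICTION = auto()
    L1_CACHE = auto()
    L2_CACHE = auto()


def run_dumps(selected, dumpers):
    """Call only the dump routines whose flag is set in `selected`."""
    results = {}
    for flag, dump in dumpers.items():
        if flag in selected:
            results[flag.name] = dump()
    return results


# Placeholder routines standing in for the real array readers.
dumpers = {
    Dump.REGISTERS: lambda: "register state",
    Dump.TLB: lambda: "TLB entries",
    Dump.L1_CACHE: lambda: "L1 tags and data",
}

simple_test = run_dumps(Dump.REGISTERS, dumpers)
deeper_test = run_dumps(Dump.REGISTERS | Dump.L1_CACHE, dumpers)
```

A single piece of code serves both cases here; only the selection value changes between the simple test and the deeper one.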

              Another method to make dumping the cache state more efficient is write coalescing (WC). Writing the cache lines out one at a time on the front-side bus is not very efficient. Instead, 16 pieces of data are bundled and sent as a burst on the front-side bus. This approach reduces bus o...