Method and apparatus for analyzing the performance of multiple execution units in a processor

IP.com Disclosure Number: IPCOM000013927D
Original Publication Date: 2001-Jan-01
Included in the Prior Art Database: 2003-Jun-19
Document File: 2 page(s) / 41K

Publishing Venue

IBM

Abstract

Disclosed is a method and apparatus for analyzing the performance of multiple execution units in a processor. In a modern processor with multiple units of a particular type, it is desirable, from a performance analysis point of view, to count the total number of operations of a particular type that complete across those units during a clock cycle. This disclosure provides a mechanism to combine events from similar units, look at the individual unit events, or any combination thereof, while reducing the amount of Performance Monitor Unit (PMU) resources required.



For every set of unique events that can occur simultaneously in similar execution units, an adder is provided that determines the total number of events of that type that occurred in each cycle. The output of the adder is used as an input to a PMU counter and controls the amount by which the counter is incremented while it is counting the event. As an example, consider a processor with two floating-point units. To count the number of floating-point multiplies performed in a piece of software, the PMU would be set up as shown in Figure 1.

Figure 1
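The adder-driven counter can be illustrated with a small software model. This is only a sketch of the idea described above, not the disclosed hardware: the function names `adder` and `count_events` and the four-cycle trace are hypothetical, chosen for illustration.

```python
from typing import Iterable, Sequence


def adder(unit_events: Sequence[bool]) -> int:
    """Return how many units signaled the event in this cycle."""
    return sum(1 for event in unit_events if event)


def count_events(trace: Iterable[Sequence[bool]]) -> int:
    """Model a single PMU counter whose increment is the adder output."""
    counter = 0
    for unit_events in trace:
        # The counter advances by 0, 1, or 2 per cycle for two units.
        counter += adder(unit_events)
    return counter


# Hypothetical trace: each tuple is one clock cycle, holding
# (FPU0 completed a multiply, FPU1 completed a multiply).
trace = [(True, True), (True, False), (False, False), (False, True)]
total_multiplies = count_events(trace)
```

In this model a single counter accumulates the multiplies from both floating-point units, where a counter that can only increment by one per cycle would need one counter per unit.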

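Previous implementations could count events only from unit 0 or from both units together. This implementation provides more flexibility: events can be counted from either unit individually or from any combination of units.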
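One way to picture this flexibility is a per-unit enable applied ahead of the adder. The disclosure text does not say how units are selected, so the `enable_mask` parameter below is an assumption made purely for illustration, as are the function names and the sample trace.

```python
from typing import Iterable, Sequence


def masked_increment(unit_events: Sequence[bool],
                     enable_mask: Sequence[bool]) -> int:
    """Sum the events of the enabled units for one cycle (assumed scheme)."""
    return sum(1 for event, enabled in zip(unit_events, enable_mask)
               if event and enabled)


def count_with_mask(trace: Iterable[Sequence[bool]],
                    enable_mask: Sequence[bool]) -> int:
    """Accumulate the masked adder output over every cycle of the trace."""
    return sum(masked_increment(events, enable_mask) for events in trace)


# Hypothetical trace: (FPU0 multiply completed, FPU1 multiply completed).
trace = [(True, True), (True, False), (False, False), (False, True)]

unit0_only = count_with_mask(trace, (True, False))
unit1_only = count_with_mask(trace, (False, True))
both_units = count_with_mask(trace, (True, True))
```

With the same counter, changing only the mask selects unit 0 alone, unit 1 alone, or both units combined.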
