
Programmatic Control of Hardware Performance Event Multiplexing

IP.com Disclosure Number: IPCOM000216947D
Publication Date: 2012-Apr-25
Document File: 3 page(s) / 24K

Publishing Venue

The IP.com Prior Art Database

Abstract

A "switch event" function that switches event collection statistics to a new set of monitored events is disclosed.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 42% of the total text.


Disclosed is a "switch event" function that switches statistics collection to a new set of hardware events.

Performance analysts need to collect accurate performance data as quickly as possible in order to characterize an application or workload, and the characterization should be reproducible. This is especially important in high performance computing (HPC), where the workloads are scientific or technical rather than commercial applications. The objective is to minimize the time needed to accurately collect hardware performance event statistics for a given workload.

For some time, many processor chips have included circuitry to monitor hardware performance. The hardware performance monitor unit (HPM or PMU) supports only a limited number of events at the same time. Using this hardware to collect performance statistics on application workloads is preferred over software-based methods. Hardware support minimizes the impact of recording performance events while the application runs: it keeps perturbations to the collected statistics small and does not significantly increase the time needed to run the workload being analyzed.

Modern processors are capable of monitoring dozens, even hundreds, of different processor events. However, while the PMU can track many events, it has only a limited number of hardware counters to use at any given time. For example, the Power® 5 processor has 6 counters available for collecting statistics, 2 fixed and 4 programmable. A commodity processor, like the Pentium® III, is limited even further, to only two PMU counters [1], yet it can monitor over 80 events [2]. These events are collected together into groups and made available through tools provided by the operating system. For pSeries® processors these are the performance monitor application programming interface (PMAPI) library under AIX® 5L and the PAPI library on Linux® kernels.
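The gap between the number of counters and the number of events can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `min_groups` is hypothetical, and the result is a lower bound, since it assumes any event can be placed on any counter, which real PMUs typically do not allow.

```python
import math

def min_groups(num_events: int, num_counters: int) -> int:
    """Lower bound on the number of event groups needed to observe every event once.

    Assumes (for illustration) that any event can use any counter.
    """
    if num_counters <= 0:
        raise ValueError("num_counters must be positive")
    return math.ceil(num_events / num_counters)

# Figures from the text: the Pentium III has two counters and over 80 events,
# so at least 40 separate groups are needed to see each event once.
pentium_iii_groups = min_groups(80, 2)
print(pentium_iii_groups)
```

Each group has to be measured either in its own run of the workload or in its own slice of a single run, which is why the number of groups drives the total collection time.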

Hundreds of events (dozens of event groups) need to be collected to completely characterize a given application workload. In an example, 89 groups are recommended to characterize workload performance for systems based on the Power 5 processor.
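To illustrate what switching between event groups within a single run might look like, the following is a minimal simulation. It is not PMAPI or PAPI code and not the disclosed implementation, whose interface this abbreviated text does not give: the class `SimulatedPMU`, its `switch_events` and `tick` methods, and the event names are all hypothetical.

```python
from collections import defaultdict

class SimulatedPMU:
    """Toy stand-in for a PMU with a fixed number of counters (hypothetical)."""

    def __init__(self, num_counters):
        self.num_counters = num_counters
        self.live = {}                     # counts for the currently programmed group
        self.totals = defaultdict(int)     # counts accumulated across switches

    def switch_events(self, group):
        """Hypothetical 'switch event' call: save current counts, program a new group."""
        if len(group) > self.num_counters:
            raise ValueError("group does not fit in the available counters")
        for event, count in self.live.items():
            self.totals[event] += count
        self.live = {event: 0 for event in group}

    def tick(self, observed):
        """Advance one simulated interval; only programmed events are counted."""
        for event in self.live:
            self.live[event] += observed.get(event, 0)

# Two groups share two counters over four intervals of one simulated run.
pmu = SimulatedPMU(num_counters=2)
groups = [("cycles", "instructions"), ("l1_misses", "branches")]
per_interval = {"cycles": 100, "instructions": 80, "l1_misses": 3, "branches": 10}
for interval in range(4):
    pmu.switch_events(groups[interval % len(groups)])
    pmu.tick(per_interval)
pmu.switch_events(())                      # flush the last group's counts
print(dict(pmu.totals))
```

In this toy run each group is observed for only half of the intervals, so every total covers half of the workload. With dozens of groups the observed fraction per group shrinks accordingly, which is the trade-off any multiplexing scheme has to manage.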

Performance statistics characterize an application workload's behavior and can be used for:

1. Measuring how effectively the workload uses computer hardware resources. (Something a customer managing the computer installation might be interested in studying.)

2. Identifying opportunities to optimize programs that are considered to be performing poorly. (Something an application developer might be interested in studying.)

3. Projecting the workloads' performance onto future hardware. (Something a technical salesperson might be interested in studying.)

4. Identifying CPU bottlenecks that hinder current performance and direct design efforts for future hardware. (Something a hardware designer might be i...