Generation of Real-Time Address Execution Histograms for Digital Signal Processors

IP.com Disclosure Number: IPCOM000110901D
Original Publication Date: 1994-Jan-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 4 page(s) / 171K

Publishing Venue

IBM

Related People

Barrett, SC: AUTHOR [+3]

Abstract

Analysis of instruction execution frequency is a useful technique for (A) enhancing the performance of application software by determining which areas of code are executed most and concentrating improvement efforts on those areas, and (B) examining the most used features and instructions of a current processor to understand where improvements can be made on future designs.

This is the abbreviated version, containing approximately 35% of the total text.

      Capturing instruction frequency data is difficult on current
Digital Signal Processors (DSPs) executing real application code, for
the following reasons:

o   DSP applications must run in real time.  It is not possible to
    slow or stop the processor between instructions to log execution
    frequency data.

o   DSPs are typically RISC processors that execute one instruction
    per clock cycle, with very fast cycle times.  Hardware that
    monitors execution frequency must keep up with the CPU cycle time
    and may be very expensive to implement (requiring custom VLSI
    designs in fast circuit technologies).

      Disclosed is a special-purpose pipeline approach that allows
non-invasive, real-time capture of instruction execution frequency in
a technology-independent manner.  The technique may be implemented in
a relatively slow technology, allowing the use of TTL SSI/MSI or FPGA
devices either to prototype analysis tools for research use or to
produce, economically and at relatively low volume, analysis tools
for sale to software developers.

      As shown in the Figure, a fixed-function hardware pipeline is
implemented to capture instruction execution frequency.  The Figure
shows a three-stage pipeline, which is adequate to monitor current DSP
cycle times (on the order of 50-60 nanoseconds) if implemented in TTL
or FPGA logic.  Such an implementation would require relatively
little fixed development expense (in comparison to a custom VLSI
design) and might therefore be an attractive approach to building a
low-volume product.

      The heart of the design is a 32K x 48-bit memory array.  The
maximum instruction store address space of the particular DSP being
analyzed is 32K words, so the memory array effectively provides a
48-bit counter for each instruction location.  In the worst case
(only one instruction location addressed repetitively), this allows
more than 4600 hours of continuous data capture at a processor cycle
time of 60 nanoseconds before a counter overflows.  Naturally, the
counter width can be decreased or increased as desired.  A width of
48 bits was chosen for the baseline design to remove any practical
time limits on data capture.
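
      The capacity figures above can be checked with a short
behavioral sketch.  It is a software model only, not the disclosed
hardware: the names AddressHistogram, record, hottest,
hours_to_overflow and array_size_bytes are hypothetical, and the
counter update is assumed to behave like a simple wrapping
increment.  Only the 32K address space, the 48-bit width and the
60 nanosecond cycle time come from the text.

```python
# Behavioral sketch only; all names here are illustrative assumptions.
ADDRESS_SPACE = 32 * 1024   # 32K instruction words (from the text)
COUNTER_BITS = 48           # baseline counter width (from the text)
CYCLE_TIME_NS = 60          # processor cycle time quoted in the text


def hours_to_overflow(counter_bits=COUNTER_BITS, cycle_time_ns=CYCLE_TIME_NS):
    """Worst case: one address is executed on every cycle until it wraps."""
    seconds = (2 ** counter_bits) * cycle_time_ns * 1e-9
    return seconds / 3600.0


def array_size_bytes(words=ADDRESS_SPACE, counter_bits=COUNTER_BITS):
    """Storage needed for one counter per instruction address."""
    return words * counter_bits // 8


class AddressHistogram:
    """Software stand-in for the counter memory: one counter per address."""

    def __init__(self, words=ADDRESS_SPACE, counter_bits=COUNTER_BITS):
        self.counts = [0] * words
        self.mask = (1 << counter_bits) - 1

    def record(self, address):
        # Assumed behavior: increment the counter for this address and
        # wrap at the counter width, as a fixed-width counter would.
        self.counts[address] = (self.counts[address] + 1) & self.mask

    def hottest(self, n=3):
        """Return up to n (address, count) pairs, most executed first."""
        ranked = sorted(enumerate(self.counts), key=lambda p: p[1], reverse=True)
        return [(addr, count) for addr, count in ranked[:n] if count > 0]


if __name__ == "__main__":
    print(f"hours to overflow: {hours_to_overflow():.1f}")
    print(f"array size: {array_size_bytes()} bytes")
    hist = AddressHistogram()
    for addr in [5, 5, 7, 5, 7, 9]:   # made-up address trace
        hist.record(addr)
    print(hist.hottest(2))
```

      With the baseline values the worst-case capture time comes out
between 4600 and 4700 hours, consistent with the figure quoted above,
and the array holds 196,608 bytes of counter storage.  Narrowing the
counter shortens the capture window by a factor of two per bit
removed.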

      The pipeline continuously builds a "histogram" count of
execution frequency for each instruction location, as follows:

1.  The hardware is attached to the instruction address bus of the
    DSP, monitoring address values that are sent to instruction
    memory.

2.  The same ...