Partial Inner Loop Instruction Repeat Buffer

IP.com Disclosure Number: IPCOM000018220D
Original Publication Date: 2003-Jul-23
Included in the Prior Art Database: 2003-Jul-23
Document File: 5 page(s) / 112K

Publishing Venue

Motorola

Related People

Larry Richard Tate: AUTHOR [+2]

Abstract

A. The Problem - Instruction fetch power is a significant component of overall power consumption in general-purpose DSPs. In many real-world DSP applications, 20% of the instructions consume 80% of the available machine cycles, and the other 80% of the instructions consume the remaining 20%. Many of the most processing-intensive algorithms are implemented as highly optimized code inside loops. The instructions of the innermost loops execute far more frequently than the outer-loop instructions. Most of the instruction fetch power consumption therefore comes from inner loops. The problem is to reduce overall fetch power while maintaining full programmability and without inconveniencing the programmer.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 44% of the total text.

by Larry Richard Tate and James W. Stroming

A. The Problem - Instruction fetch power is a significant component of overall power consumption in general-purpose DSPs. In many real-world DSP applications, 20% of the instructions consume 80% of the available machine cycles, and the other 80% of the instructions consume the remaining 20%. Many of the most processing-intensive algorithms are implemented as highly optimized code inside loops. The instructions of the innermost loops execute far more frequently than the outer-loop instructions. Most of the instruction fetch power consumption therefore comes from inner loops. The problem is to reduce overall fetch power while maintaining full programmability and without inconveniencing the programmer.

B. The current solution does not resolve the problem. The Lucent DSP16xx and 16xxx families employ multiple instruction repeat buffers, which implement zero-overhead hardware loops through an explicit “DO K { instructions }” instruction. The purpose is to allow one addressing unit to be multiplexed between instructions and data without incurring too many wait states. Buffer size is fixed, so zero-overhead looping is only possible for loops of limited size. If the entire loop does not fit into the repeat buffer, then the repeat buffer cannot be used at all with the Lucent approach. Nested zero-overhead loops are not supported. Each instruction that accesses two data operands incurs a memory stall on the first pass through the loop.

C. The new solution is a zero-overhead looping mechanism that implements all or part of a set of nested loops using an instruction repeat buffer. When the code size exceeds the available buffer size for a particular nested loop, the new solution simply fetches the remaining instructions from memory. It resumes fetching from the instruction repeat buffer when the code wraps back to the beginning of the loop. The buffer is supported by a full set of addressing units and buses to instruction memory, so no wait states occur. The mechanism eliminates stalls caused by multiple attempted accesses to the same memory bank and reduces fetch power consumption by fetching a large percentage of instructions from the buffer (a register file) located at the core’s fetch controller. This eliminates the array access and bus power otherwise needed to bring each instruction to the core’s fetch controller.
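The difference between the two approaches can be made concrete with a small fetch count. The following Python sketch is illustrative only and is not taken from the disclosure: the function names, the assumption that every instruction occupies one buffer entry, and the assumption that the first pass is fetched entirely from memory while the buffer fills are all simplifications introduced here. The "all or nothing" function is a simplified stand-in for the limitation described in paragraph B.

```python
def partial_buffer_fetches(loop_len, buffer_size, iterations):
    """Return (memory_fetches, buffer_fetches) for a single loop.

    Assumed model (not from the disclosure): one buffer entry per
    instruction, and the first pass is fetched from memory while the
    buffer is being filled.
    """
    if loop_len < 1 or iterations < 1 or buffer_size < 0:
        raise ValueError("loop_len and iterations must be >= 1, buffer_size >= 0")
    buffered = min(loop_len, buffer_size)   # instructions that fit in the buffer
    overflow = loop_len - buffered          # instructions that never fit
    repeats = iterations - 1                # passes after the first
    memory = loop_len + overflow * repeats  # first pass plus overflow on every repeat
    from_buffer = buffered * repeats        # buffer hits on every repeat
    return memory, from_buffer


def all_or_nothing_fetches(loop_len, buffer_size, iterations):
    """Same count when the buffer is usable only if the whole loop fits."""
    if loop_len <= buffer_size:
        return partial_buffer_fetches(loop_len, buffer_size, iterations)
    return loop_len * iterations, 0         # buffer unused: every fetch hits memory


if __name__ == "__main__":
    # Hypothetical numbers: 24-instruction loop, 16-entry buffer, 100 iterations.
    print(partial_buffer_fetches(24, 16, 100))
    print(all_or_nothing_fetches(24, 16, 100))
```

Under these assumed numbers the partial scheme still serves two thirds of the repeated fetches from the buffer, whereas the all-or-nothing model sends every fetch to memory because the loop is eight instructions too long.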

Figure 1 illustrates the use of the fetch set buffer for the innermost loop:

Enable buffer bit is set for the particular loop

Loop iteration count (LC1) is greater than or equal to two (LC1 >= 2)

Instructions will be loaded into the instruction buffer as they are executed on the first pass through the loop

When the controller detects that the last counter address in the loop has been reached then the buffer counter (BAC) is reset

Figure 1 – Fetch Set Buffer for Inner Most Loop
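The conditions and actions listed for Figure 1 can be sketched as a small trace generator. This is a hedged illustration, not the disclosed hardware: the function name `trace_inner_loop`, the list-based buffer, and the exact way the buffer counter (BAC) advances are assumptions made here, since the figure itself is not available in this text.

```python
def trace_inner_loop(program, buffer_size, loop_count, enable=True):
    """Return a list of (instruction, source) pairs for an inner loop.

    program     : instruction words of the loop body, in address order
    buffer_size : number of entries in the (assumed) fetch set buffer
    loop_count  : iteration count of the loop (LC1 in the text)
    enable      : the loop's enable-buffer bit
    """
    use_buffer = enable and loop_count >= 2   # both listed conditions must hold
    slots = []                                # assumed buffer storage
    bac = 0                                   # buffer counter (BAC)
    trace = []
    for iteration in range(loop_count):
        for offset, word in enumerate(program):
            if use_buffer and iteration > 0 and bac < len(slots):
                # Later passes: serve from the buffer while entries remain.
                trace.append((slots[bac], "buffer"))
                bac += 1
            else:
                # First pass, or past the end of the buffer: fetch from memory.
                trace.append((word, "memory"))
                if use_buffer and iteration == 0 and bac < buffer_size:
                    slots.append(word)        # load as executed on the first pass
                    bac += 1
            if offset == len(program) - 1:
                bac = 0                       # last loop address reached: reset BAC
    return trace


if __name__ == "__main__":
    # Hypothetical 3-instruction loop with a 2-entry buffer, 3 iterations.
    for word, source in trace_inner_loop(["i0", "i1", "i2"], 2, 3):
        print(word, source)
```

In this sketch the instruction stream seen by the core is identical with or without the buffer; only the source of each fetch changes, which reflects the stated goal of keeping the mechanism transparent to the programmer.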

Last counter address detection sets the AB status bit based...