Browse Prior Art Database

Acceleration of Multimedia Applications using Branch Conditional to Previous Target Instruction

IP.com Disclosure Number: IPCOM000112194D
Original Publication Date: 1994-Apr-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 4 page(s) / 128K

Publishing Venue

IBM

Related People

Karim, F: AUTHOR [+5]

Abstract

Multimedia applications often have heavily branch-dependent code. A common pattern involves a series of branches (checking different conditions) to the same target address.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

Acceleration of Multimedia Applications using Branch Conditional
to Previous Target Instruction

      INTRODUCTION

Multimedia applications often have heavily branch-dependent code.  A
common pattern involves a series of branches (checking different
conditions) to the same target address.

      A new instruction, branch conditional to previous target (bc=),
allows consecutive conditional branches to be processed without
delays between them if they all share the same target address.  This
is accomplished by checking all the conditions for each branch
simultaneously.  Whenever one of the conditions is resolved in favor
of taking the branch, the rest of the unresolved conditions do not
need to be processed.

EXAMPLE PROBLEMS

The following two examples are from image processing and graphics
code.

Example 1 is from the JPEG image decompression algorithm.

  m1 = 0xFF000000

  m2 = 0x00FF0000

  m3 = 0x0000FF00

  m4 = 0x000000FF
       .
       .

  /* Inner loop */

  if ((abcd & m1) == m1) ||

      (abcd & m2) == m2) ||

      (abcd & m3) == m3) ||

      (abcd & m4) == m4) Eval_Escape_Code();

  else { .   /* Decompress word */
            .
            .

'FF' in a byte signals an escape code to the JPEG algorithm.  The
above code detects the presence of an 'FF' in any byte in a word.

      This example, when compiled, will result in four consecutive
compares and branches to the same target address.  Bc= can be used
here to reduce branch penalties significantly.

Example 2 is from a typical 2-D graphics clipping algorithm.

  compare x, wx

  compare y, wy

  compare x, -wx

  compare y, -wy

  branch_on_x_wx

  branch_on_y_wy

  branch_on_x_-wx

  branch_on_y_-wx

      The above code has four compare instructions which have the
same target address, like the JPEG image decompression algorithm.
For conventional processors, branch_on_y_wy will not be processed
until branch_on_x_wx gets resolved.  Similarly, the later branches
must wait until each earlier branch resolves.  Even though the target
address for the branch is the same, the target address is regenerated
as each branch instruction is processed.

      Fig. 1 shows how a processor with two fixed point units
executes the four consecutive regular branch instructions, assuming
all branches not taken except the last one.  With this method, a
six-cycle branch penalty is incurred.

      Using branch conditional to previous target (bc=) instructions,
the code can be executed with a much smaller branch penalty.  Fig. 2
shows how a processor with two fixed point units executes a branch
instruction followed by three consecutive bc= instructions, again
assuming all branches not taken except the l...