Data Packing and Unpacking Scheme for High Performance Image Processing

IP.com Disclosure Number: IPCOM000105339D
Original Publication Date: 1993-Jul-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 6 page(s) / 191K

Publishing Venue

IBM

Related People

Findley, RL: AUTHOR [+5]

Abstract

In image processing, it is desirable to maximize storage utilization by packing four 8-bit pixel intensities per 32-bit word. Operations on these bytes of data can only be done two at a time due to the possibility of overflow. Thus, the packing of the data into the four bytes per 32-bit word format creates a bottleneck in the system and it is necessary to have a method of unpacking the data into a usable form. This implementation offers a maximal packing rate since the data can be re-packed into this format in a single cycle and greatly increases the rate of image processing by eliminating the unpacking bottleneck.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 34% of the total text.

BACKGROUND - Gray-scale images are stored as an array of 8-bit
intensities.  When the processing environment utilizes 32-bit words,
it is common practice to store four 8-bit intensities per 32-bit
word as shown in Fig. 1.  This format, however, does not allow
arithmetic to be performed directly on the packed word, because a
positive or negative overflow from one byte would corrupt the
adjacent byte.
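
The hazard can be illustrated with a minimal sketch in Python, with
results truncated to 32 bits.  The byte order A B C D within the word,
the pixel values, and the helper name add_packed are assumptions made
for illustration; the figures are not reproduced in this text.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit register


def add_packed(word, delta):
    """Add delta (0..255) to all four bytes of a packed word with one add."""
    return (word + delta * 0x01010101) & MASK32


# Assumed layout: pixels A=0x10, B=0xFF, C=0x20, D=0x30 in one word.
packed = 0x10FF2030
result = add_packed(packed, 1)

# B wraps from 0xFF to 0x00 and its carry spills into A,
# so A ends up as 0x12 instead of the intended 0x11.
pixel_a = (result >> 24) & 0xFF
pixel_b = (result >> 16) & 0xFF
print(hex(result), hex(pixel_a), hex(pixel_b))
```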

      Image processing applications have inherent parallelism.  Image
routines generally perform the same operation on all pixels in the
image.  Thus, performance may be enhanced by operating on multiple
pixels simultaneously.  By packing data into the format shown in Fig.
2, two pixels per word can be operated on simultaneously without the
loss of data integrity.
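
A sketch of the two-pixels-per-word idea follows.  It assumes the
Fig. 2 layout is word 1 = 0 A 0 C and word 2 = 0 B 0 D, which is
inferred from the packing description rather than taken from the
figure itself; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF      # emulate a 32-bit register
FIELD_MASK = 0x00FF00FF  # keeps the two pixel bytes, clears the 0 fields


def unpack(packed):
    """Split A B C D into (0 A 0 C, 0 B 0 D) under the assumed layout."""
    word1 = (packed >> 8) & FIELD_MASK
    word2 = packed & FIELD_MASK
    return word1, word2


def add_two_pixels(word, delta):
    """Add delta (0..255) to both pixels of an unpacked word with one add."""
    return (word + delta * 0x00010001) & MASK32


word1, word2 = unpack(0x10FF2030)   # A=0x10, B=0xFF, C=0x20, D=0x30
sum1 = add_two_pixels(word1, 1)
sum2 = add_two_pixels(word2, 1)

# B overflows into its own 0 field (0x0100); D is left intact.
print(hex(sum1), hex(sum2))
```

Each pixel effectively has a 16-bit field to itself, so an 8-bit
overflow stays inside that field instead of reaching a neighbor.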

The problem is stated below:

For data packing - Some operations on data in the format shown in
Fig. 2 cause an overflow into the 0 fields of the result, that is,
into the 0 byte to the left of each pixel (A, B, C, D) in Fig. 2.
While these fields keep each pixel independent during processing,
the overflow bits would corrupt the data when it is packed back into
the original format shown in Fig. 1.  Thus it is imperative that the
0 fields not affect the pixels in the packed result.

      Once the 0 fields have been cleared, word 1 must be shifted
such that data A and C are in the first and third bytes,
respectively.  This will align the data so that it can be merged with
word 2.  The merge is done by logically ORing words 1 and 2 together.
When this process is complete, the data will look like that shown in
Fig. 1.
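
The clear, shift, and OR steps described above can be sketched in
software as follows.  This is only an illustration of the data flow,
not the single-cycle implementation the disclosure proposes; the word
layout (word 1 = 0 A 0 C, word 2 = 0 B 0 D) and the names pack and
pack_without_clearing are assumptions.

```python
MASK32 = 0xFFFFFFFF      # emulate a 32-bit register
FIELD_MASK = 0x00FF00FF  # keeps the two pixel bytes, clears the 0 fields


def pack(word1, word2):
    """Clear the 0 fields, shift word 1 left one byte, OR in word 2."""
    cleared1 = word1 & FIELD_MASK
    cleared2 = word2 & FIELD_MASK
    return ((cleared1 << 8) | cleared2) & MASK32


def pack_without_clearing(word1, word2):
    """Same merge with the masking step skipped, to show the corruption."""
    return ((word1 << 8) | word2) & MASK32


# Word 2 carries an overflow bit (0x01) in the 0 field left of pixel B.
word1 = 0x00100020   # A=0x10, C=0x20
word2 = 0x01000031   # overflow, B=0x00, D=0x31

good = pack(word1, word2)                  # A stays 0x10
bad = pack_without_clearing(word1, word2)  # overflow bit is ORed into A
print(hex(good), hex(bad))
```

Masking discards the overflow, so a pixel that overflowed wraps modulo
256 in this sketch; no clamping is attempted.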

      Since the packing of data is done very frequently, it is
desirable to minimize the number of cycles required to perform this
operation.  Traditionally, this process takes several cycles to
complete.  For example, consider an implementation on the RS/6000.
On this architecture, the image pack would require 5 instructions
and 6 cycles.  The instructions required would be:

Cycle   Instruction
   1    l     00FF00FF              (load hex constant 00FF00FF)
   2    -                           (wait in other unit)
   3    and   0A0C, 00FF00FF, GAGC  (mask of...