
Technique for Reading Variable Bit Codes from a Datastream

IP.com Disclosure Number: IPCOM000114047D
Original Publication Date: 1994-Nov-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 84K

Publishing Venue

IBM

Related People

Lahey, LC: AUTHOR

Abstract

Disclosed is a technique to extract variable length compression codes from a compressed datastream. Some data compression schemes (LZW, for example) use variable length codes to identify specific bit patterns in a datastream. (In LZW compression/decompression, these codes start as 9-bit codes and may increase to 12-bit codes as more codes are needed to represent the bit patterns encountered in the datastream. These codes and their corresponding bit patterns are kept in a table.) Reading such a datastream (to decompress it, for example) can be slow if it requires processing the datastream in a bitwise fashion. This technique allows the datastream to be processed in multi-byte chunks, which can speed up processing by as much as a factor of (8 x BytesPerChunk).

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.


      Disclosed is a technique to extract variable length compression
codes from a compressed datastream.  Some data compression schemes
(LZW, for example) use variable length codes to identify specific bit
patterns in a datastream.  (In LZW compression/decompression, these
codes start as 9-bit codes and may increase to 12-bit codes as more
codes are needed to represent the bit patterns encountered in the
datastream.  These codes and their corresponding bit patterns are
kept in a table.)  Reading such a datastream (to decompress it, for
example) can be slow if it requires processing the datastream in a
bitwise fashion.  This technique allows the datastream to be
processed in multi-byte chunks, which can speed up processing by as
much as a factor of (8 x BytesPerChunk).  The number of bytes
processed at once is determined by the size of the codes to be read
(for example, a 9-bit code may span at most 2 bytes, while a 10-bit
code may span 3 bytes).
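
      As a rough check of these figures, the number of bytes a code
touches can be computed from its starting bit offset and its length.
The sketch below is illustrative only; the function names
bytes_spanned and max_bytes_spanned are hypothetical and do not
appear in the original disclosure.

```python
def bytes_spanned(bit_offset: int, code_bits: int) -> int:
    """Bytes touched by a code of code_bits bits that starts
    bit_offset bits into the stream (hypothetical helper)."""
    first_byte = bit_offset // 8
    last_byte = (bit_offset + code_bits - 1) // 8
    return last_byte - first_byte + 1


def max_bytes_spanned(code_bits: int) -> int:
    """Worst case over the 8 possible alignments within a byte."""
    return max(bytes_spanned(offset, code_bits) for offset in range(8))
```

      Under this calculation, max_bytes_spanned gives 2 for a 9-bit
code and 3 for 10-bit through 12-bit codes, so a 4-byte read always
covers any code in that range.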

      The datastream is treated as an input buffer that contains the
codes packed together.  In the case of LZW compression, no code value
ever spans more than 4 bytes of data.  Therefore, if 4 bytes of data
are always processed at a time, all the bits of the code to be
extracted are guaranteed to be present.  The central idea of this
technique is simple and consists of the following steps (as
illustrated in the Figure):
  1.  Assume the data is not byte-aligned, so the bits of interest
       reside in up to 4 bytes.
  2.  Read the 4 bytes that contain the bits of interest (the bits
       that contain the code) into a 4-byte integer.  (Note that this
       may involve some additional code to allow for byte-swapping
       on some processors.)
  3.  Left-shift this 4-byte integer (0-filling the least significant
       bits of the 4-byte integer) to position the first bit of
       interest at the most significant bit position.
  4.  Right-shift the resulting 4-byte integer (0-filling the most
       significant bits of the 4-byte integer) to position the last
       bit of interest at the least significant bit position....
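
      Steps 1 through 4 can be sketched as follows.  This is a
minimal illustration, not the disclosure's own code, and it rests on
several assumptions: codes are packed most-significant-bit first (as
the shift directions in steps 3 and 4 suggest), the 4 bytes are
interpreted in big-endian order (standing in for the byte-swapping
note in step 2), and a read near the end of the buffer is padded with
zero bytes.  The name read_code and its parameters are hypothetical.
Because Python integers are unbounded, the mask 0xFFFFFFFF emulates
the overflow of a 4-byte integer during the left shift.

```python
def read_code(buf: bytes, bit_pos: int, code_bits: int) -> int:
    """Extract a code_bits-wide code starting at bit_pos (hypothetical)."""
    byte_index = bit_pos // 8   # first byte holding bits of interest
    skip = bit_pos % 8          # leading bits in that byte to discard

    # Step 2: read 4 bytes into one integer (assumption: big-endian,
    # zero-padded if fewer than 4 bytes remain in the buffer).
    chunk = buf[byte_index:byte_index + 4].ljust(4, b"\x00")
    word = int.from_bytes(chunk, "big")

    # Step 3: left-shift so the first bit of interest is the MSB;
    # the mask keeps the value within 32 bits.
    word = (word << skip) & 0xFFFFFFFF

    # Step 4: right-shift so the last bit of interest is the LSB.
    return word >> (32 - code_bits)
```

      For instance, with the three 9-bit codes 0x101, 0x0FF and 0x155
packed MSB-first into the bytes 80 BF EA A0 (hex), the calls
read_code(buf, 0, 9), read_code(buf, 9, 9) and read_code(buf, 18, 9)
recover those three values in turn.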