
Parallel Implementation of Algorithm Context

IP.com Disclosure Number: IPCOM000110673D
Original Publication Date: 1992-Dec-01
Included in the Prior Art Database: 2005-Mar-25
Document File: 3 page(s) / 171K

Publishing Venue

IBM

Related People

Furlan, G: AUTHOR [+2]

Abstract

The original algorithm Context [1,2,3] has a slow sequential implementation. By a reorganization of the algorithm, pipelined or parallel implementations of both the encoding and the decoding operations can be obtained. Generally speaking, the encoding part of the algorithm admits a greater degree of parallelism than the decoding part, because the encoder has the next symbol immediately available, whereas the decoder must first decode it.

In order to maximize the degree of parallelism in the algorithm, the various calculations are decomposed and reorganized into the following four steps (a sketch follows the list):
1.  Form the context for the current symbol by the desired reordering of the past symbols.
2.  Search for the encoding/decoding node by climbing the tree according to the context of the current symbol.
3.  Encode/decode the current symbol with a multiplication-free arithmetic encoder/decoder, using the probabilities given by the occurrence counts at the node chosen in step 2.
4.  Update the tree by incrementing the occurrence counts of the encoded/decoded symbol along the path defined by the possible contexts of the current symbol, and grow the tree as described in the usual implementation of the algorithm.
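
As a concrete illustration, the following Python sketch walks the four steps for a binary source. All names (ContextNode, form_context, and so on) are illustrative rather than taken from the disclosure, and the multiplication-free arithmetic coder of step 3 is reduced to a probability lookup; this shows only the decomposition, with step 4 still done serially (its parallel form is discussed next).

from dataclasses import dataclass, field

@dataclass
class ContextNode:
    counts: list = field(default_factory=lambda: [1, 1])  # occurrence counts for symbols 0 and 1
    children: dict = field(default_factory=dict)

def form_context(past, depth=3):
    # Step 1: form the context by the desired reordering of the past
    # symbols (here simply the most recent ones, newest first).
    return tuple(reversed(past[-depth:]))

def climb(root, context):
    # Step 2: climb the tree along the context symbols; the deepest
    # existing node on the path is the encoding/decoding node.
    path = [root]
    node = root
    for sym in context:
        if sym not in node.children:
            break
        node = node.children[sym]
        path.append(node)
    return path

def code_symbol(node, symbol):
    # Step 3 (placeholder): a real implementation would drive a
    # multiplication-free arithmetic coder from these counts; here we
    # only return the modelled probability of the symbol.
    return node.counts[symbol] / sum(node.counts)

def update(path, context, symbol):
    # Step 4: increment the counts of the coded symbol along the visited
    # path, and grow the tree by one node along the current context.
    for node in path:
        node.counts[symbol] += 1
    depth = len(path) - 1              # context symbols already matched
    if depth < len(context):
        path[-1].children[context[depth]] = ContextNode()

root, past = ContextNode(), []
for symbol in (0, 1, 1, 0, 1, 1, 1, 0):
    ctx = form_context(past)           # step 1
    path = climb(root, ctx)            # step 2
    p = code_symbol(path[-1], symbol)  # step 3
    update(path, ctx, symbol)          # step 4, still serial here
    past.append(symbol)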

Of these, the first three must be done serially, while the remaining step 4 can be done in parallel with the first three by storing the addresses of the visited nodes in a stack. With this arrangement both the encoding and the decoding parts are done in a similar manner. However, incrementing the counts for the symbol x_t, say, in parallel with processing steps 1, 2, and 3 for the symbol x_{t+1}, as called for in step 4, may cause an ambiguity in the case where x_t = x_{t+1}, which makes the new path identical to the old. To make sure that the encoder/decoder always uses the same incremented counts, a flag at each node is needed, indicating whether an outstanding increment has been applied (a sketch follows).
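
This stack-and-flag arrangement can be made concrete with a small and purely illustrative sketch: a single updater thread stands in for the parallel step-4 unit, the queue carries the stacked node addresses, and a per-node event plays the role of the flag, so the coder always waits for an outstanding increment before using a node's counts.

import threading
import queue

class Node:
    def __init__(self):
        self.counts = [1, 1]
        self.updated = threading.Event()
        self.updated.set()             # no update outstanding initially

update_queue = queue.Queue()

def updater():
    # Step 4, running concurrently: take the stacked node addresses and
    # increment the counts of the just-coded symbol.
    while True:
        item = update_queue.get()
        if item is None:
            break
        path, symbol = item
        for node in path:
            node.counts[symbol] += 1
            node.updated.set()         # flag: counts are now current

def coder_reads(node):
    # Steps 2/3 on the coder side: before using a node's counts, wait
    # for any outstanding increment on it (the flag check in the text).
    node.updated.wait()
    return list(node.counts)

def defer_update(path, symbol):
    # Coder side of step 4: mark the visited nodes and hand the stacked
    # addresses to the updater.
    for node in path:
        node.updated.clear()           # flag: increment still outstanding
    update_queue.put((path, symbol))

root = Node()
worker = threading.Thread(target=updater)
worker.start()
defer_update([root], 1)     # step 4 for x_t handed off
print(coder_reads(root))    # coder for x_{t+1} waits, then sees [1, 2]
update_queue.put(None)      # shut the sketch down
worker.join()

The interesting case is the one from the text: when x_t = x_{t+1} the second symbol revisits the same node, and the cleared flag forces the coder to wait until the increment for x_t has landed, so encoder and decoder never diverge.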

In case the encoder and the decoder are in exact synchronism, some of the (early) nodes may be used with their counts not yet incremented, since both sides then make identical decisions from identical, if stale, counts. This permits a further increase in speed. Such a possibility is particularly useful if Single Instruction, Single Data (SISD) or Multiple Instruction, Single Data (MISD) architectures are employed.

There is a considerable amount of freedom in implementing the procedure outlined above. In the following, two preferred architecture embodiments are outlined for practical implementation.

Pipeline Structure

This architecture uses a common memory for all the processors. The main problem with it is the numerous memory accesses for the different tasks. Sharing these accesses among pipelined tasks seems very difficult, and the overhead could be considerable. The best solution seems to be to assign the tasks independently to several processors accessing the same memory. This requires a memory of much higher speed than that of the processors implementing the different tasks. Each processor, then, is assigned a clock that is shifted from the...