Browse Prior Art Database

A method for fast deflate bit stream parsing using speculation Disclosure Number: IPCOM000180444D
Original Publication Date: 2009-Mar-10
Included in the Prior Art Database: 2009-Mar-10
Document File: 1 page(s) / 54K

Publishing Venue



Common compression algorithms such as Gzip employ a mixture of history based compression [1] and Huffman codes : History based compression , seeks past string patterns that are identical to the current one. Repeating patterns allow compression by for example, stating a pair (1208,50) which instruct the decompressor to "copy 50 bytes from 1208 bytes ago" instead of actually specifying the content of these 50 bytes. In Gzip these are named a length distance pair (e.g. length=1208, distance=50). Huffman coding is assigns non-uniform code lengths to symbols (or length or distances), so that useful codes are encoded using shorter bit streams. Such a decompression engine has to first decode the Huffman code which could represent literals, lengths or distances. Speeding up performance by parallelism is problematic because the division of the input bit stream is not known a-priori. Current methods perform this sequentially, thus they know where the next encoded symbol starts after decoding the current one.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 90% of the total text.

Page 1 of 1

A method for fast deflate bit stream parsing using speculation

The difficulty in processing an input byte every other cycle derives from the following facts:

    The boundary between symbol depends on the decode of the previous symbol

    A symbol could be literal or a

pair. If it is a pair it could have a long length

            - requiring additional bits, it definitely has additional bits for distance, but could have extra bits for especially long distances.

    Running the pipeline at high frequency, leaves a short period of time for the execution of each pipeline stage, making the buffer read operations and match calculation - a multiple cycle operations.

    In light of these constrains, known alternatives compromise on performance, working sequentially at a lower clock rate.

The key novelty in this invention is that it employs speculation. We observe that literals are the majority of input symbols and that most lengths do not use extra bits. (See slide 1 in the attached presentation). Our approach assumes that in most case there is no need for additional bits, and therefore the speculative bit shift can be based on the number of the code bits

    In our scheme we speculate on the amount of bits that need to be shifted. We assume that the current symbol is a literal or short length copy before knowing that is the case. The miss rate of this speculation is not minor (~20%) and it is handled efficiently as well.

Performance (note that a useful (even...