Browse Prior Art Database

Method for Limiting the Number of Ideal Huffman Tree Encodings for Prospective Data Formats

IP.com Disclosure Number: IPCOM000236473D
Publication Date: 2014-Apr-29
Document File: 3 page(s) / 52K

Publishing Venue

The IP.com Prior Art Database

Abstract

This is a method for limiting the number of ideal Huffman tree encodings for prospective data formats, where the limit on the number of trees is imposed due to an internal limitation.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.



Optimizing compression ratios for different types of data is critical in a data-retention environment. Performing the compression operation at high throughput allows data compression to be injected during processing where it was previously not possible due to latency considerations.

The DEFLATE standard, documented in RFC 1951, provides a cross-platform data compression specification. It allows two phases of compression to be performed:


1. LZ77 - Replaces repeated byte sequences within a 32-kilobyte rolling window with distance/length pairs


2. Huffman Encoding - Allows for either a documented re-encoding of the data (Fixed Huffman) or a custom re-encoding of the data (Dynamic Huffman) to optimize the symbol sizes.
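The effect of the two choices in the second phase can be observed with Python's zlib module, which implements DEFLATE. This is a minimal sketch, assuming a zlib build that exposes the Z_FIXED strategy constant; the negative wbits value requests a raw DEFLATE stream with no zlib header.

```python
import zlib

def deflate(data, strategy):
    # Raw DEFLATE (wbits = -15 suppresses the zlib header/trailer),
    # maximum compression level, chosen Huffman strategy.
    co = zlib.compressobj(9, zlib.DEFLATED, -15, 9, strategy)
    return co.compress(data) + co.flush()

data = b"abracadabra " * 200

# Z_FIXED forces the documented fixed Huffman tables from RFC 1951;
# the default strategy lets the encoder emit dynamic Huffman blocks.
fixed = deflate(data, zlib.Z_FIXED)
dynamic = deflate(data, zlib.Z_DEFAULT_STRATEGY)

# Both streams decode to the original bytes.
assert zlib.decompressobj(-15).decompress(fixed) == data
assert zlib.decompressobj(-15).decompress(dynamic) == data
```

On data whose symbol frequencies differ markedly from the assumptions behind the fixed tables, the dynamic-block stream is typically smaller, at the cost of transmitting the custom table.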

The Huffman Encoding step can have a significant impact on the overall compression ratio. This step allows each symbol to be re-encoded into a variable-length series of bits. If the assignment of Huffman codes is done based on frequency analysis over a subset of the data being compressed, then the symbols with the highest frequencies always receive the shortest Huffman codes.
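The frequency-to-code-length relationship described above can be sketched with the standard Huffman construction: repeatedly merge the two lowest-frequency nodes, so that rarely seen symbols end up deeper in the tree. The function name and sample frequencies below are illustrative, not from the disclosure.

```python
import heapq

def huffman_code_lengths(freqs):
    """Return the Huffman code length (tree depth) for each symbol,
    given a dict mapping symbol -> frequency."""
    # Each heap entry: (total frequency, tie-breaker, symbols in subtree).
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    tie = len(heap)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, tie, syms1 + syms2))
        tie += 1
    return lengths

lengths = huffman_code_lengths({'a': 50, 'b': 20, 'c': 15, 'd': 10, 'e': 5})
```

With these frequencies the most common symbol 'a' receives a one-bit code while the rarest symbols receive the longest codes, and the lengths satisfy the Kraft equality (the sum of 2^-length over all symbols equals 1), as any full binary Huffman tree must.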

In order to provide the lowest latency for compression operations, the DEFLATE algorithm would be implemented in hardware. To increase the compression ratio that a device could deliver with minimal impact on the elapsed time of the compression operation, some level of Dynamic Huffman Encoding must be introduced. A true Dynamic Huffman Encoding could not be efficiently implemented in the hardware, so a new technique was used: a Canned Dynamic Huffman Encoding, in which one of several pre-determined encodings is selected for the data based on an initial sampling during the request.
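The selection step of a canned scheme can be sketched as follows: histogram a sample of the request, estimate the encoded bit cost of that sample under each pre-determined code-length table, and choose the cheapest usable table. The function name, the toy tables, and the cost model here are illustrative assumptions; the disclosure does not specify the actual tables or sampling policy.

```python
from collections import Counter

def pick_canned_table(sample, tables):
    """Return the index of the pre-determined code-length table that
    minimizes the estimated encoded size of the sample.
    Each table maps symbol -> code length in bits."""
    freqs = Counter(sample)
    best_idx, best_bits = 0, float('inf')
    for idx, lengths in enumerate(tables):
        # A table is only usable if it assigns a code to every symbol seen.
        if any(s not in lengths for s in freqs):
            continue
        # Estimated cost: frequency-weighted sum of code lengths.
        bits = sum(freqs[s] * lengths[s] for s in freqs)
        if bits < best_bits:
            best_idx, best_bits = idx, bits
    return best_idx

# Two hypothetical canned tables: one favoring 'a', one favoring 'c'.
tables = [{'a': 1, 'b': 2, 'c': 2},
          {'c': 1, 'a': 2, 'b': 2}]
```

Because only a fixed menu of tables is considered, the selection reduces to a handful of multiply-accumulate comparisons over the sample histogram, which is far cheaper in hardware than constructing a true per-request Huffman tree.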

The issue of determining the corre...