Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Huffman Code Generation

IP.com Disclosure Number: IPCOM000045569D
Original Publication Date: 1983-Apr-01
Included in the Prior Art Database: 2005-Feb-07
Document File: 4 page(s) / 22K

Publishing Venue

IBM

Related People

Deacon, JH: AUTHOR [+2]

Abstract

Huffman code generation must be done in a manner that provides optimal codes yet allows decoding to be done in an efficient manner. Definitions: ni - Number of occurrences of the event. N - Total number of events. Pi - Probability of event i. Pi = ni/N CWi - Code word variable in length. This is the value assigned by the Huffman technique for this entry. CWLi - Length of the code word (CWi) in bits. MAXCWL - Maximum code word length. It is currently set to 12. CWAi - Code word assignment - length is always MAXCWL. This is the code word (CWi) followed by enough ones to pad it out. #NUMBERS - number of numbers used in this entry.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 65% of the total text.

Page 1 of 4

Huffman Code Generation

Huffman code generation must be done in a manner that provides optimal codes yet allows decoding to be done in an efficient manner.

Definitions: ni - Number of occurrences of the event. N - Total number of events.

Pi - Probability of event i.

Pi = ni/N

CWi - Code word variable in length.

This is the value assigned by the Huffman

technique for this entry.

CWLi - Length of the code word (CWi) in bits.

MAXCWL - Maximum code word length.

It is currently set to 12.

CWAi - Code word assignment - length is always

MAXCWL. This is the code word (CWi)

followed by enough ones to pad it out.

#NUMBERS - number of numbers used in this entry.

(MAXCWL-CWLi)

#NUMBERS i - 2

For example, set CWLi = 3, MAXCWL = 12.

The first 3 bits are the CWi, the last-9

bits are "don't cares"; therefore, 2/12-3

=/9/ number of numbers used by this entry.

TOTNUMS - Cumulative total of the number of numbers

in use.

TOTNUMS - Sigma/N/ #NUMBERSi

i=1.

The average number of bits required to represent each symbol of the source alphabet is: H = - Sigma Pi log(2) Pi

i=1 From this the number of bits needed to represent a symbol is: Ti = -log(2) Pi = log(2) Pi/-1/

It is not possible to use fractions of bits, so Ti must be rounded out to the nearest integer.

This provides a first approximation, but it is not optimal. The first approximation is the initial value for CWLi. This is used to compute the #NUMBERSi value. Every entry must be given its initial values before the optimization can start.

The optimization scheme is based upin the fractional part of Ti. In other words, Ti was saved before it was rounded out to the nearest integer. The optimization consists of subtracting from every Ti some small value (e.g., 0.1). Whenever this value becomes "less than or equal to" the next lower integer, then this entry is a candidate for optimization. Then, to be optimized, the following relationship must hold:

1

Page 2 of 4

2/MAXCWL > #NUMBERSi + TOTNUMS If this is true, then CWLi is decremented by one, #NUMBERSi is added to TOTWS, and #NUMBERSi is doubled. Every entry is examined (for as many passes as necessary) until 2/MAXCWL/ = TOTNUMS.

The thrust of this is to optimize first...