Browse Prior Art Database

Data Compaction and Security System

IP.com Disclosure Number: IPCOM000076150D
Original Publication Date: 1972-Jan-01
Included in the Prior Art Database: 2005-Feb-24
Document File: 3 page(s) / 52K

Publishing Venue

IBM

Related People

Cocke, J: AUTHOR [+2]

Abstract

Data compaction and data security are simultaneously achieved by a variable-state coding system that makes a character-by-character random selection of encoding tables for compacting the input data and a corresponding random selection of decoding tables, such selections being determined in each instance by a combination of variable factors including the current coding state of the system, the group membership of the character just processed, certain specified transition probabilities among states and a pseudorandom sequence of numbers selected by a random-number generator.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 49% of the total text.

Page 1 of 3

Data Compaction and Security System

Data compaction and data security are simultaneously achieved by a variable-state coding system that makes a character-by-character random selection of encoding tables for compacting the input data and a corresponding random selection of decoding tables, such selections being determined in each instance by a combination of variable factors including the current coding state of the system, the group membership of the character just processed, certain specified transition probabilities among states and a pseudorandom sequence of numbers selected by a random-number generator.

The implementation of this concept involves, first, defining a plurality of coding states for data compaction purposes. Next, there is defined for each of these states (by known methods) a variable-length encoding scheme whereby each input character is converted to a code word, whose length is inversely related to the relative frequency with which that particular character is expected to occur in the data base or state under consideration.

To provide a simple illustration, assume that the entire alphabet consists of only four characters, a, b, c and d, and that there are three different states 1, 2 and 3. Letting P(a), P(b), etc. represent the occurrence probabilities of characters a, b, etc., respectively, the following table will define three such illustrative states: Occurrence Probability State 1 State 2 State 3 p(a) 0.50
0.125 0.25 P(b) 0.25 0.50 0.25 P(c) 0.125 0.25 0.25 P(d)
0.125 0.125 0.25

The encoding table established for each of these three states will convert the various input characters to code words, whose lengths are inversely related to the occurrence probabilities of the respective characters in the assumed data base. The most frequently occurring characters therefore are represented by the shortest code words, thus achieving a smaller average code word length than would be possible in a system made up of fixed-length code words only. An exception to this coding principle would occur in a state wherein all characters occur with equal probability, in which case all code words have a uniform length. The following table depicts an illustrative set of encoding tables for the various assumed states 1, 2 and 3: CODE WORDS Input Character State 1 State 2 State 3 a 0 111 00 b 10 0 01 c 110 10 10 d 111 110 11

Decoding is accomplished by decoding tables that convert the code words to their original characters, there being one such table for each state.

Data compaction and security are provided by the ability of the system to effect transitions from state-to-state in a randomly selected fashion at both the encoding and decoding ends. Each transition is governed in part by the identity of the current coding state, the magnitude of a randomly generated number and an assumed table of transition probabilities among states. Further variation is provided by segregating the characters into groups, the respective memberships of...