Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Compressed Dictionary for Word Verification

IP.com Disclosure Number: IPCOM000047397D
Original Publication Date: 1983-Nov-01
Included in the Prior Art Database: 2005-Feb-07
Document File: 3 page(s) / 39K

Publishing Venue

IBM

Related People

Grossman, RS: AUTHOR [+2]

Abstract

A compressed dictionary is formed by storing a 1 bit if a certain combination of characters occurs at certain pairs of character positions in any word in the dictionary. A word is compared with the dictionary entries by first forming it into the character pairs and then testing for a 1 or 0 bit in the corresponding location in the dictionary. A result of all 1 bits is a match. The dictionary is compressed because certain character pairs occur at particular positions in many words of most dictionaries, but only one bit position of storage is used. The system is effective because these common character pairs contribute relatively little to distinguishing one word from another. Background There are many applications where an input word is checked against a list of valid input words.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Page 1 of 3

Compressed Dictionary for Word Verification

A compressed dictionary is formed by storing a 1 bit if a certain combination of characters occurs at certain pairs of character positions in any word in the dictionary. A word is compared with the dictionary entries by first forming it into the character pairs and then testing for a 1 or 0 bit in the corresponding location in the dictionary. A result of all 1 bits is a match. The dictionary is compressed because certain character pairs occur at particular positions in many words of most dictionaries, but only one bit position of storage is used. The system is effective because these common character pairs contribute relatively little to distinguishing one word from another. Background There are many applications where an input word is checked against a list of valid input words. As familiar examples, in the procedure for logging on to a data processing system, the identification of a user of the system may be checked against a list of authorized users; the license of a car may be checked against a list of stolen cards, or a word may be checked against a dictionary of correctly spelled words. In some systems, the list contains each entry in full, and an input is unambiguously matched or not matched. There are also applications where absolute accuracy is not needed and the words can be stored in a compressed form. Compressing the list can simplify the storage and/or logic requirements of the system. Implementation As an example, consider how the dictionary stores the word "cat". First a terminator (*) is appended to the word to form "cat*". The word is then formed into character pairs c-a, a-t and t-*. For each of these pairs there is a separate storage array. The row heading is the first character of a pair, and the column heading is the second character of a pair. Each array has a one-bit column position and a one-bit row position for each character of the alphabet. A bit position in the array is a 1 if the corresponding pair of characters is valid in some word in the dictionary. The terminator appears only as the second character of a pa...