Spelling Correction Function Enhancement

IP.com Disclosure Number: IPCOM000041667D
Original Publication Date: 1984-Feb-01
Included in the Prior Art Database: 2005-Feb-02
Document File: 1 page(s) / 11K

Publishing Venue

IBM

Related People

Garrison, DA: AUTHOR [+2]

Described is a method for enhancing the operation of a spelling correction function. The dictionary is arranged and searched in records, each of which groups root words having the same set of leading characters. At dictionary build time, the second- and third-letter content of each record is determined and stored in a bit mask in which the i-th position corresponds to the i-th letter of the alphabet. Specifically, the i-th mask bit is 0 if and only if no root in the given record has the i-th letter in its second or third position. There is therefore one bit mask for each data record in the dictionary.

Normally, during spelling correction, all the records of one letter group are read in and unfolded to look for close matches with the input word. With the record filter, the mask for each record is read first, and the second and third characters of the input word determine whether the record itself needs to be read. This is implemented by forming an input word mask during initialization of the correction function: the input mask is first set to all zeros, and then the bits corresponding to the second and third input characters are set. This mask is ANDed with each record mask. A zero result implies that the associated record does not need to be read.
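The record filter described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the 26-letter lowercase alphabet, the function names (letter_bit, word_mask, record_mask, records_to_read), and the sample records are all assumptions made for illustration, and integers stand in for the stored bit masks.

```python
import string

# Assumption: a 26-letter lowercase alphabet; bit i stands for the i-th letter.
ALPHABET = string.ascii_lowercase


def letter_bit(ch):
    """Return the mask bit for one letter, or 0 if it is not in ALPHABET."""
    idx = ALPHABET.find(ch.lower())
    return 1 << idx if idx >= 0 else 0


def word_mask(word):
    """Start from all zeros, then set the bits for the 2nd and 3rd characters."""
    mask = 0
    for ch in word[1:3]:
        mask |= letter_bit(ch)
    return mask


def record_mask(roots):
    """Build-time step: OR together the masks of every root in one record."""
    mask = 0
    for root in roots:
        mask |= word_mask(root)
    return mask


def records_to_read(word, records, masks):
    """Correction-time step: keep only records whose mask ANDs to nonzero."""
    query = word_mask(word)
    return [rec for rec, mask in zip(records, masks) if query & mask]


# Hypothetical records for the "s" letter group.
records = [
    ["sand", "salt", "sad"],
    ["spell", "speak", "spend"],
    ["stop", "stem", "story"],
]
masks = [record_mask(rec) for rec in records]  # one mask per data record

candidates = records_to_read("speling", records, masks)
```

In this sketch an input word with fewer than two characters produces an all-zero mask, so every record is filtered out; a practical implementation would have to decide how to treat such inputs.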
