Recognition of Misspelled Keywords

IP.com Disclosure Number: IPCOM000078437D
Original Publication Date: 1973-Jan-01
Included in the Prior Art Database: 2005-Feb-25
Document File: 2 page(s) / 13K

Publishing Venue

IBM

Related People

Kellerman, E: AUTHOR

Abstract

An algorithm for use by compilers in recognizing misspelled keywords is described. This algorithm can be used for correcting spelling errors interactively on terminal systems. For use by batch compilers, the user can set a "threshold of similarity" which must be exceeded before the compiler corrects a misspelled keyword.

This is the abbreviated version, containing approximately 63% of the total text.

In order to test the algorithm, a new metalanguage symbol t was defined for use with a syntax checker.

Let SOURCE be a character vector containing the source statement being processed. Let PNTR be the location in SOURCE presently being examined.
A. Obtain a correlation coefficient between the text that begins in position PNTR of SOURCE and each of the keywords that may appear in that position. When comparing the source text against a keyword, use as much of the source statement as necessary to match the length of the keyword, padding on the right with blanks as required. For example, if the source statement is:

   OPEN FILE(x) RECRDPRINT;

and PNTR is 14, with the keywords that are acceptable at this point being:

   SEQUENTIAL
   DIRECT
   RECORD
   PRINT

then, when comparing against SEQUENTIAL, which has 10 letters, RECRDPRINT is used. Similarly, when comparing against DIRECT, which has 6 letters, RECRDP is used. And so on.
B. Select the keyword which resulted in the largest coefficient, provided the coefficient exceeded a preset threshold value. Otherwise, ask the user which keyword he meant.
C. Determine how many characters of...
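
Steps A and B above can be illustrated with a short Python sketch. It is not the disclosed implementation. The names window, similarity, and choose_keyword are hypothetical. The text available here does not define the correlation coefficient, so the ratio computed by SequenceMatcher from Python's standard difflib module is used purely as a stand-in. Step C is cut off in this text and is not modeled.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def window(source: str, pntr: int, keyword: str) -> str:
    """Step A: take the text at 1-based position pntr, cut to the
    keyword's length and padded on the right with blanks if too short."""
    start = pntr - 1
    return source[start:start + len(keyword)].ljust(len(keyword))


def similarity(text: str, keyword: str) -> float:
    """Assumed stand-in for the correlation coefficient (not from the source)."""
    return SequenceMatcher(None, text, keyword).ratio()


def choose_keyword(source: str, pntr: int, keywords: Sequence[str],
                   threshold: float) -> Optional[str]:
    """Step B: return the best-scoring keyword if its score exceeds the
    threshold; return None to signal that the user should be asked."""
    if not keywords:
        return None
    scored = [(similarity(window(source, pntr, kw), kw), kw) for kw in keywords]
    best_score, best_kw = max(scored, key=lambda pair: pair[0])
    return best_kw if best_score > threshold else None


stmt = "OPEN FILE(x) RECRDPRINT;"
allowed = ["SEQUENTIAL", "DIRECT", "RECORD", "PRINT"]
print(window(stmt, 14, "SEQUENTIAL"))          # RECRDPRINT
print(window(stmt, 14, "DIRECT"))              # RECRDP
print(choose_keyword(stmt, 14, allowed, 0.6))  # RECORD
```

With this stand-in measure, a threshold of 0.6 accepts RECORD for the example statement. Raising the threshold far enough makes choose_keyword return None, which corresponds to asking the user which keyword was meant.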