Treating Low Frequency Dictionary Words As Misspellings to Increase Correction Accuracy

IP.com Disclosure Number: IPCOM000034832D
Original Publication Date: 1989-Apr-01
Included in the Prior Art Database: 2005-Jan-27

Publishing Venue

IBM

Related People

Authors:
Damerau, FJ; Mays, E

Abstract

Disclosed is an improvement to spelling error detection and correction programs that treats real but rare words in the spelling dictionary in a manner similar to misspelled words. This permits the use of very large dictionaries without losing the ability to detect probable errors, a loss that occurs when a misspelling coincides with a legitimate but rare word and is therefore accepted. Spelling correctors currently in use commonly look up input words in a dictionary and flag as misspelled any word not found in the dictionary list. Some of these dictionaries are very large, with as many as 50,000 words, which means that some of the words they contain are quite rare in actual usage. It is intuitively clear that as the size of a dictionary increases, the likelihood that an incorrectly spelled word will match the spelling of a correct word also rises.
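
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the frequency table `WORD_FREQ`, the cutoff `RARE_THRESHOLD`, and the helper names `edits1` and `check_word` are hypothetical, and generating candidates by single-character edits is an assumption made here for illustration, since the abstract does not say how corrections are proposed.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical frequency table (illustrative counts, not real corpus data).
WORD_FREQ: Dict[str, int] = {
    "the": 60000,
    "from": 4000,
    "form": 300,
    "calendar": 80,
    "fro": 1,        # real but rare word
    "calender": 1,   # real but rare word, easily typed in place of "calendar"
}

# Assumed cutoff: dictionary words seen fewer times than this are treated as suspect.
RARE_THRESHOLD = 5

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word: str) -> Set[str]:
    """Return all strings one deletion, transposition, substitution, or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    return (deletes | transposes | substitutes | inserts) - {word}


def check_word(
    word: str,
    freq: Dict[str, int] = WORD_FREQ,
    threshold: int = RARE_THRESHOLD,
) -> Tuple[str, List[str]]:
    """Classify a word as 'ok', 'rare', or 'unknown' and suggest common neighbours.

    Rare dictionary words are handled like unknown words: both are flagged,
    and both receive suggestions drawn only from sufficiently common words.
    """
    w = word.lower()
    count = freq.get(w, 0)
    if count >= threshold:
        return ("ok", [])
    # Keep only neighbours that are common words, most frequent first.
    candidates = sorted(
        (c for c in edits1(w) if freq.get(c, 0) >= threshold),
        key=lambda c: (-freq[c], c),
    )
    status = "rare" if count > 0 else "unknown"
    return (status, candidates)


if __name__ == "__main__":
    for token in ["from", "fro", "calender", "teh"]:
        print(token, check_word(token))
```

In this sketch a word such as "calender" is in the dictionary yet is still flagged, with "calendar" offered as the more probable intended word. A conventional lookup-only checker would accept it silently. A practical system would presumably also let the user confirm that the rare word was intended.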